
HDDS-12373. Change calculation logic for volume reserved space #7927

Draft
wants to merge 1 commit into master

Conversation

@symious (Contributor) commented Feb 19, 2025

What changes were proposed in this pull request?

The current logic for "hdds.datanode.dir.du.reserved" is as follows:

// Remaining reserved space = configured reservation minus space already used by non-Ozone data,
// clamped at zero.
private long getRemainingReserved() {
  return Math.max(reservedInBytes - getOtherUsed(), 0L);
}

which means that if getOtherUsed() is larger than reservedInBytes, the remaining reserved space is counted as 0.

When we set "hdds.datanode.dir.du.reserved" to 100GB, we actually want the disk to keep 100GB free to avoid "space not enough" exceptions.

But servers normally have a system-level block reservation of about 5%. For a 10TB disk, the system-level reserved space is therefore about 500GB, so when we set the configuration to 100GB, the remaining reserved space is calculated as 0, and the reservation is not counted in capacity and availability.

With the current calculation logic, in order to reserve 100GB we need to set the configuration to "600GB" (500 + 100) or "0.06" (0.05 + 0.01 when using the percent setting).

This ticket changes the reservation calculation so that it is more intuitive for users: the configured reservation is honored as-is, and users no longer need to account for the other usages themselves.
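
To illustrate the intent, a minimal sketch assuming the field and method names from the snippet above (reservedInBytes, getOtherUsed()); the final patch may differ in detail, see the later discussion:

// Sketch of the proposed behavior: the configured reservation is kept in full,
// independent of how much non-Ozone data already exists on the disk.
private long getRemainingReserved() {
  // Old logic: Math.max(reservedInBytes - getOtherUsed(), 0L), which collapses
  // to 0 once other usage (e.g. the system-level 5% reservation) exceeds the
  // configured value.
  return reservedInBytes;
}

With this, setting "hdds.datanode.dir.du.reserved" to 100GB always keeps 100GB out of the space reported as available, no matter how large the other (non-Ozone) usage is.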

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-12373

How was this patch tested?

Unit test.

@adoroszlai (Contributor) left a comment

Thanks @symious for the patch.

Tests run: 16, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 1.096 s <<< FAILURE! - in org.apache.hadoop.ozone.container.common.volume.TestHddsVolume

Please wait for a clean CI run in your fork before opening the PR (or open it as a draft).

@adoroszlai adoroszlai marked this pull request as draft February 19, 2025 08:34
@adoroszlai (Contributor)

Thanks @symious for updating the patch, LGTM. Added some other reviewers who worked on space logic previously.

@ChenSammi (Contributor)

One thing I just thought about is that with this new change, if a user has already set "hdds.datanode.dir.du.reserved" or "hdds.datanode.dir.du.reserved.percent" to account for non-Ozone services, the reservation may become too large after an Ozone version upgrade. For example, if the Ozone DN is co-deployed with a YARN service, which needs disk space for shuffle data and other data, the user may have set "hdds.datanode.dir.du.reserved.percent" to 20% or 30%. That would then be too much under the new calculation logic.

@symious (Contributor, Author) commented Feb 21, 2025

The user may have set "hdds.datanode.dir.du.reserved.percent" to 20% or 30%. That would then be too much under the new calculation logic.

IMO, the reserved configuration is there to make sure there is enough space left for Ozone usage, and for this goal, 50GB of free space would be enough. Ozone should not worry about the other usages (YARN or the system reserve); even if Ozone spares a 20% reservation for YARN, there is no guarantee that YARN will only use 20%.

It is safer for Ozone to always reserve the configured space regardless of other usages.

@adoroszlai (Contributor) commented Feb 21, 2025

IMO, the reserved configuration is there to make sure there is enough space left for Ozone usage, and for this goal, 50GB of free space would be enough.

According to the config doc, hdds.datanode.volume.min.free.space (and .percent) is for Ozone usage (closing containers), and hdds.datanode.dir.du.reserved (inherited from HDFS) is for non-Ozone usage.

<property>
  <name>hdds.datanode.dir.du.reserved</name>
  <value/>
  <tag>OZONE, CONTAINER, STORAGE, MANAGEMENT</tag>
  <description>Reserved space in bytes per volume. Always leave this much space free for non dfs use.
    Such as /dir1:100B, /dir2:200MB, means dir1 reserves 100 bytes and dir2 reserves 200 MB.
  </description>
</property>

<property>
  <name>hdds.datanode.volume.min.free.space</name>
  <value>5GB</value>
  <tag>OZONE, CONTAINER, STORAGE, MANAGEMENT</tag>
  <description>
    This determines the free space to be used for closing containers
    When the difference between volume capacity and used reaches this number,
    containers that reside on this volume will be closed and no new containers
    would be allocated on this volume.
  </description>
</property>
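
For context, a hypothetical ozone-site.xml override of both settings (the path and values below are illustrative only, not recommendations):

<property>
  <name>hdds.datanode.dir.du.reserved</name>
  <!-- keep 100GB on /data/disk1 free for non-Ozone use (hypothetical path and value) -->
  <value>/data/disk1:100GB</value>
</property>
<property>
  <name>hdds.datanode.volume.min.free.space</name>
  <!-- treat the volume as full for new containers once free space drops below 50GB -->
  <value>50GB</value>
</property>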

Ozone should not worry about the other usages (YARN or the system reserve); even if Ozone spares a 20% reservation for YARN, there is no guarantee that YARN will only use 20%.

I agree, maybe we should consider deprecating this setting?

@adoroszlai (Contributor) left a comment

Thanks @symious for updating the patch, LGTM except for an unused method.

So after this change, the reported datanode disk capacity will exclude the system-reserved space.

@siddhantsangwan siddhantsangwan self-requested a review February 25, 2025 08:58
@ChenSammi (Contributor) commented Feb 26, 2025

@symious, currently we have:

  1. hdds.datanode.dir.du.reserved and hdds.datanode.dir.du.reserved.percent, similar properties inherited from HDFS, which define the space left for non-Ozone use.
  2. hdds.datanode.volume.min.free.space and hdds.datanode.volume.min.free.space.percent, introduced in HDDS-8254, which define that once a volume's free space drops to this value, the volume is treated as full.

I totally agree it's a good idea. Based on the current state, I would propose that instead of changing the hdds.datanode.dir.du.reserved and hdds.datanode.dir.du.reserved.percent concepts, we reuse hdds.datanode.volume.min.free.space and hdds.datanode.volume.min.free.space.percent. Make SCM pipeline allocation, container allocation, container candidate selection in the container balancer, and container selection in the replication manager all aware of this min.free.space, so that disk-full situations can be avoided starting from SCM. What do you think @symious?

Also, the current hdds.datanode.volume.min.free.space and hdds.datanode.volume.min.free.space.percent have quite small default values; we should consider increasing them.
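
Purely as an illustration of the kind of check being described (not Ozone's actual SCM code; the name and signature below are hypothetical):

// Hypothetical helper: a volume is eligible for new pipelines/containers only
// if, after placing the requested data, its free space stays above min.free.space.
static boolean hasEnoughFreeSpace(long capacity, long used, long committed,
    long minFreeSpace, long requestedSize) {
  long free = capacity - used - committed;
  return free - requestedSize >= minFreeSpace;
}

Pipeline allocation, container allocation, the container balancer, and the replication manager could all apply such a check before selecting a target, so that volumes approaching min.free.space are skipped early.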

cc @sadanand48, @vtutrinov .

@symious (Contributor, Author) commented Feb 27, 2025

Make SCM pipeline allocation, container allocation, container candidate selection in the container balancer, and container selection in the replication manager all aware of this min.free.space, so that disk-full situations can be avoided starting from SCM.

@ChenSammi Are you suggesting to use "min.free.space" instead of "dir.du.reserved"?

@ChenSammi (Contributor) commented Feb 27, 2025

Make SCM pipeline allocation, container allocation, container candidate selection in the container balancer, and container selection in the replication manager all aware of this min.free.space, so that disk-full situations can be avoided starting from SCM.

@ChenSammi Are you suggesting to use "min.free.space" instead of "dir.du.reserved"?

Yes, I'm suggesting that we use hdds.datanode.volume.min.free.space and hdds.datanode.volume.min.free.space.percent to achieve the same goal.

@symious (Contributor, Author) commented Mar 13, 2025

Discussed with @ChenSammi: it's better to keep the logic of the "du.reserved" config, so here we only change the log message and remove the system-reserved space from the calculation.

@ChenSammi @adoroszlai @sadanand48 PTAL.

@adoroszlai adoroszlai requested a review from sumitagrawl March 13, 2025 07:03
@siddhantsangwan (Contributor)

@symious I've also been looking into this area and I'm planning to review this pull request soon.
