HDDS-12589. Fix Incorrect FSO Key Listing for Container-to-Key Mapping. #8078
+240
−8
What changes were proposed in this pull request?
Problem:
When using FSO buckets, files with the same name uploaded into different directories were being merged into a single key record. This was because Recon’s container key mapping used only the volume, bucket, and file name as the unique identifier, which ignored the full directory path information.
Reproducing the Issue:
The issue can be reproduced by creating a nested directory structure and uploading two files (testfile1 and testfile2) at different directory depths. For example, run the following commands:
In this scenario, two duplicate file names (testfile1 and testfile2) are created in three different directory hierarchies (dir1, dir1/dir2, and dir1/dir2/dir3).
Root Cause:
The root cause was that the Recon container key mapping computed a unique key based only on the volume, bucket, and file name. For FSO buckets, the directory structure is encoded as part of the raw key prefix (using negative object IDs), but this information was being omitted from the computed key. As a result, files with identical names from different directories were being incorrectly merged.
Fix:
The fix updates the container key mapping logic to use the raw key prefix from the container key table as the unique identifier. Since the raw key prefix includes the complete directory structure (with the object IDs representing the directories, volume, bucket), this change ensures that keys with the same file name but in different directories (as in the above scenario) are recognized as distinct records by Recon.
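The difference between the two identifiers can be illustrated with a minimal, self-contained sketch. It is not the Recon code: the KeyEntry class, both grouping methods, the volume and bucket names, the numeric IDs, and the /volumeId/bucketId/parentId/fileName prefix layout are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FsoKeyMappingSketch {

  /** Hypothetical stand-in for one row of the container key table. */
  static final class KeyEntry {
    final String rawKeyPrefix;
    final String volume;
    final String bucket;
    final String fileName;

    KeyEntry(String rawKeyPrefix, String volume, String bucket, String fileName) {
      this.rawKeyPrefix = rawKeyPrefix;
      this.volume = volume;
      this.bucket = bucket;
      this.fileName = fileName;
    }
  }

  /** Builds testfile1 and testfile2 under dir1, dir1/dir2 and dir1/dir2/dir3. */
  static List<KeyEntry> sampleEntries() {
    // Placeholder object IDs; real values are assigned by the Ozone Manager.
    String volumeId = "-100";
    String bucketId = "-200";
    String[] parentIds = {"-301", "-302", "-303"}; // dir1, dir2, dir3
    List<KeyEntry> entries = new ArrayList<>();
    for (String parentId : parentIds) {
      for (String file : new String[] {"testfile1", "testfile2"}) {
        // Assumed prefix layout: /volumeId/bucketId/parentId/fileName
        String prefix = "/" + volumeId + "/" + bucketId + "/" + parentId + "/" + file;
        entries.add(new KeyEntry(prefix, "vol1", "bucket1", file));
      }
    }
    return entries;
  }

  /** Old behaviour: volume, bucket and file name only, so directories are ignored. */
  static Map<String, KeyEntry> groupByName(List<KeyEntry> entries) {
    Map<String, KeyEntry> result = new LinkedHashMap<>();
    for (KeyEntry e : entries) {
      // Later entries overwrite earlier ones that share the same name.
      result.put(e.volume + "/" + e.bucket + "/" + e.fileName, e);
    }
    return result;
  }

  /** Fixed behaviour: the raw key prefix, which carries the parent directory ID. */
  static Map<String, KeyEntry> groupByRawPrefix(List<KeyEntry> entries) {
    Map<String, KeyEntry> result = new LinkedHashMap<>();
    for (KeyEntry e : entries) {
      result.put(e.rawKeyPrefix, e);
    }
    return result;
  }

  public static void main(String[] args) {
    List<KeyEntry> entries = sampleEntries();
    System.out.println("by name: " + groupByName(entries).size());             // 2
    System.out.println("by raw prefix: " + groupByRawPrefix(entries).size());  // 6
  }
}
```

With six files across three directories, grouping by name leaves only two records, while grouping by the raw prefix keeps all six.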
What is the link to the Apache JIRA?
https://issues.apache.org/jira/browse/HDDS-12589
How was this patch tested?
Tests for ContainerKeyMapperTask and the container endpoint simulate duplicate FSO key names under different directories, ensuring that the raw key prefix is correctly used to differentiate the keys.