HDDS-12589. Fix Incorrect FSO Key Listing for Container-to-Key Mapping. #8078

Open
wants to merge 1 commit into base: master

ArafatKhan2198
Contributor

What changes were proposed in this pull request?

Problem:
In FSO buckets, files with the same name uploaded to different directories were merged into a single key record. Recon’s container key mapping used only the volume, bucket, and file name as the unique identifier, so the directory path was ignored.

Reproducing the Issue:
The issue can be reproduced by creating a nested directory structure and uploading two files (testfile1 and testfile2) at different directory depths. For example, run the following commands:

ozone fs -mkdir -p ofs://om/volume1/fso-bucket/dir1/dir2/dir3
ozone fs -put -f testfile1 ofs://om/volume1/fso-bucket/dir1/
ozone fs -put -f testfile2 ofs://om/volume1/fso-bucket/dir1/
ozone fs -put -f testfile1 ofs://om/volume1/fso-bucket/dir1/dir2/
ozone fs -put -f testfile2 ofs://om/volume1/fso-bucket/dir1/dir2/
ozone fs -put -f testfile1 ofs://om/volume1/fso-bucket/dir1/dir2/dir3/
ozone fs -put -f testfile2 ofs://om/volume1/fso-bucket/dir1/dir2/dir3/

In this scenario, the same two file names (testfile1 and testfile2) exist at three directory levels (dir1, dir1/dir2, and dir1/dir2/dir3), giving six distinct files.

Root Cause:
The Recon container key mapping computed the unique key from only the volume, bucket, and file name. For FSO buckets, the directory structure is encoded in the raw key prefix (using negative object IDs), but the computed key omitted this information. As a result, files with identical names in different directories were incorrectly merged.
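To make the collision concrete, here is a minimal, self-contained sketch. It is not Recon code: the `FsoFile` holder and the `nameOnlyKey` helper are hypothetical names used only to mimic an identifier built from volume, bucket, and file name, applied to the six files from the reproduction steps.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class NameOnlyKeySketch {

    // Hypothetical holder for one uploaded file; not a Recon class.
    static final class FsoFile {
        final String volume;
        final String bucket;
        final String dirPath;
        final String fileName;

        FsoFile(String volume, String bucket, String dirPath, String fileName) {
            this.volume = volume;
            this.bucket = bucket;
            this.dirPath = dirPath;
            this.fileName = fileName;
        }
    }

    // Builds the identifier from volume, bucket, and file name only.
    // dirPath is never consulted, which is the defect being illustrated.
    static String nameOnlyKey(FsoFile f) {
        return "/" + f.volume + "/" + f.bucket + "/" + f.fileName;
    }

    // The six files created by the reproduction commands.
    static List<FsoFile> reproFiles() {
        String[] dirs = {"dir1", "dir1/dir2", "dir1/dir2/dir3"};
        String[] names = {"testfile1", "testfile2"};
        List<FsoFile> files = new ArrayList<>();
        for (String dir : dirs) {
            for (String name : names) {
                files.add(new FsoFile("volume1", "fso-bucket", dir, name));
            }
        }
        return files;
    }

    // Counts how many distinct identifiers the files map to.
    static int distinctKeys(List<FsoFile> files) {
        Set<String> keys = new HashSet<>();
        for (FsoFile f : files) {
            keys.add(nameOnlyKey(f));
        }
        return keys.size();
    }

    public static void main(String[] args) {
        List<FsoFile> files = reproFiles();
        // Prints: 6 files, 2 distinct keys
        System.out.println(files.size() + " files, " + distinctKeys(files) + " distinct keys");
    }
}
```

Six files collapse into two identifiers, which matches the merged records seen at the container endpoint.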

Fix:
The fix changes the container key mapping logic to use the raw key prefix from the container key table as the unique identifier. Because the raw key prefix includes the complete directory structure (object IDs for the volume, bucket, and directories), keys with the same file name in different directories, as in the scenario above, are recognized by Recon as distinct records.
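The effect of keying on the prefix can be sketched as follows. This is an illustration rather than the patch itself: it assumes a prefix of the form `/<volumeId>/<bucketId>/<parentDirId>/<fileName>`, and every object ID below is made up for the example.

```java
import java.util.HashSet;
import java.util.Set;

public class PrefixKeySketch {

    // Assumed prefix layout: /<volumeId>/<bucketId>/<parentDirId>/<fileName>.
    // The parent directory's object ID is what separates same-named files.
    static String prefixKey(long volumeId, long bucketId, long parentId, String fileName) {
        return "/" + volumeId + "/" + bucketId + "/" + parentId + "/" + fileName;
    }

    // Counts distinct identifiers for the six files from the reproduction steps.
    static int distinctKeys() {
        long volumeId = -100L;                    // made-up ID for volume1
        long bucketId = -200L;                    // made-up ID for fso-bucket
        long[] parentIds = {-301L, -302L, -303L}; // made-up IDs for dir1, dir2, dir3
        String[] names = {"testfile1", "testfile2"};

        Set<String> keys = new HashSet<>();
        for (long parentId : parentIds) {
            for (String name : names) {
                keys.add(prefixKey(volumeId, bucketId, parentId, name));
            }
        }
        return keys.size();
    }

    public static void main(String[] args) {
        // Prints: 6 distinct keys
        System.out.println(distinctKeys() + " distinct keys");
    }
}
```

Because each directory contributes its own object ID to the identifier, the six files yield six identifiers instead of two.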

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-12589

How was this patch tested?

  • I manually verified the fix by running the commands above, which create files named testfile1 and testfile2 under three directory hierarchies, and confirmed that the container endpoint returned a separate record for each file.
  • I also added unit tests for both the ContainerKeyMapperTask and the container endpoint that simulate duplicate FSO key names under different directories, verifying that the raw key prefix is used to differentiate the keys.

Contributor

@devmadhuu devmadhuu left a comment

Thanks @ArafatKhan2198 for the patch. Changes LGTM +1. But as discussed, kindly take care of the pagination issue.

@ArafatKhan2198
Contributor Author

@devmadhuu as discussed with @devabhishekpal, we will incorporate the UI changes in a separate PR.

@ArafatKhan2198 ArafatKhan2198 marked this pull request as ready for review March 14, 2025 11:02
@devmadhuu devmadhuu self-requested a review March 14, 2025 13:41
Contributor

@devmadhuu devmadhuu left a comment

Thanks @ArafatKhan2198 for the patch. Changes LGTM +1.

3 participants