HDDS-12495. Add metadata flag to check block existence in ozone debug verify replicas #8079

Open
wants to merge 1 commit into
base: master

sarvekshayr
Contributor

What changes were proposed in this pull request?

Introduced a --metadata flag under the ozone debug replicas verify command to check for block existence using GetBlock calls to the datanodes. For each key, the command iterates through all replicas and verifies that the block is present.

What is the link to the Apache JIRA?

HDDS-12495

How was this patch tested?

Tested the patch on a docker cluster.

  1. When all the blocks exist for a given key, it will be indicated with "status": "BLOCK_EXISTS".
bash-5.1$ ozone debug replicas verify --metadata / --output-dir /tmp | jq  
{
  "key": "ockrwvolume/ockrwbucket/vnmnn1ltsx/1679091",
  "blockID": "conID: 1 locID: 115816896921600007 bcsId: 25 replicaIndex: null",
  "status": "BLOCK_EXISTS",
  "pass": true
}
{
  "key": "ockrwvolume/ockrwbucket/vnmnn1ltsx/45c48cc",
  "blockID": "conID: 1 locID: 115816896921600010 bcsId: 38 replicaIndex: null",
  "status": "BLOCK_EXISTS",
  "pass": true
}
{
  "key": "ockrwvolume/ockrwbucket/vnmnn1ltsx/8f14e45",
  "blockID": "conID: 1 locID: 115816896921600008 bcsId: 29 replicaIndex: null",
  "status": "BLOCK_EXISTS",
  "pass": true
}

  2. If some blocks are missing, it will be indicated with "status": "MISSING".
bash-5.1$ ozone debug replicas verify --metadata / --output-dir /tmp | jq
{
  "key": "ockrwvolume/ockrwbucket/vnmnn1ltsx/1679091",
  "status": "MISSING",
  "pass": false
}
{
  "key": "ockrwvolume/ockrwbucket/vnmnn1ltsx/45c48cc",
  "status": "MISSING",
  "pass": false
}
{
  "key": "ockrwvolume/ockrwbucket/vnmnn1ltsx/8f14e45",
  "status": "MISSING",
  "pass": false
}
  3. If an error is encountered while fetching details, it reports "status": "ERROR" along with the error message.
bash-5.1$ ozone debug replicas verify --metadata / --output-dir /tmp | jq
{
  "key": "ockrwvolume/ockrwbucket/vnmnn1ltsx/45c48cc",
  "status": "ERROR",
  "message": "No Route to Host from  10a39f1ddd74/172.31.0.7 to scm:9860 failed on socket timeout exception: java.net.NoRouteToHostException: No route to host; For more details see:  http://wiki.apache.org/hadoop/NoRouteToHost",
  "pass": false
}

for (Map.Entry<DatanodeDetails, ContainerProtos.GetBlockResponseProto> entry : responses.entrySet()) {
  if (entry.getValue() != null && entry.getValue().hasBlockData()) {
    printJsonResult(keyDetails, "BLOCK_EXISTS", keyLocation.getBlockID().toString(), true, result);
    return;
Contributor

This returns as soon as it finds the block on any one replica, i.e. if the block is missing on 2 of 3 nodes, this will still print BLOCK_EXISTS. Is this behaviour intended?

Contributor Author

That shouldn't be the behaviour; I will fix it.
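
For illustration only, here is a minimal sketch of the stricter check discussed in this thread: report BLOCK_EXISTS only when every replica returned block data, and MISSING otherwise. It is not the code from this patch. The class name ReplicaCheckSketch, the method classify, and the use of a plain Map<String, Boolean> (datanode name to "returned block data") in place of the DatanodeDetails to GetBlockResponseProto map are all assumptions made to keep the example self-contained. Treating an empty response map as MISSING is likewise an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the per-replica GetBlock responses:
// datanode name -> whether that datanode returned block data.
final class ReplicaCheckSketch {

  // Returns BLOCK_EXISTS only if every replica reported block data.
  static String classify(Map<String, Boolean> hasBlockDataByDatanode) {
    if (hasBlockDataByDatanode.isEmpty()) {
      return "MISSING"; // assumption: no responses at all counts as missing
    }
    for (Map.Entry<String, Boolean> entry : hasBlockDataByDatanode.entrySet()) {
      Boolean hasBlockData = entry.getValue();
      // A null or false value means this replica did not return the block.
      if (hasBlockData == null || !hasBlockData) {
        return "MISSING";
      }
    }
    return "BLOCK_EXISTS";
  }

  public static void main(String[] args) {
    // Block present on one of three replicas, as in the review comment.
    Map<String, Boolean> responses = new LinkedHashMap<>();
    responses.put("dn1", true);
    responses.put("dn2", false);
    responses.put("dn3", false);
    System.out.println(classify(responses)); // prints MISSING
  }
}
```

Unlike the early return in the loop above, this version cannot pass a key when the block is present on only some of its replicas.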

@sadanand48
Contributor

This is exactly what ozone debug chunkinfo does, i.e. it gets the block info from the OM, performs getBlock on all of its replicas in the pipeline, and prints the results. Why add another tool to do the same thing? If the block info is missing from any node in the debug chunkinfo output, it means MISSING; otherwise BLOCK_EXISTS.

@sarvekshayr
Contributor Author

Both commands do have similar functionality.
@errose28 please take a look.
