-
Notifications
You must be signed in to change notification settings - Fork 526
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HDDS-8660. Notify ReplicationManager when nodes go dead or out of service #7997
base: master
Are you sure you want to change the base?
Conversation
Temporarily convert to a draft, as I would like to introduce |
8c56c07
to
bc7aa2f
Compare
Hi @adoroszlai @sodonnel, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@peterxcli Thanks for your patch, overall looks good. A few comments for your reference.
maybeNotifyReplicationManager(reportedDn, oldPersistedOpState, newPersistedOpState); | ||
} | ||
|
||
private void maybeNotifyReplicationManager( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should only notify when current SCM is leader.
// Only wake up the thread if there's no active replication work | ||
// This prevents creating a new replication queue over and over | ||
// when multiple nodes change state in quick succession | ||
if (getQueue().isEmpty()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can check isThreadWaiting
too, only the wait
thread can be notified.
// Notify ReplicationManager | ||
LOG.info("Notifying ReplicationManager about dead node: {}", | ||
datanodeDetails); | ||
publisher.fireEvent(SCMEvents.REPLICATION_MANAGER_NOTIFY, datanodeDetails); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can only notify when this node is not IN_MAINTENANCE status.
} | ||
|
||
@Override | ||
public void onMessage(DatanodeDetails datanodeDetails, EventPublisher eventPublisher) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A leader check should be added.
@xichen01 Thanks for the review, addressed all comments, please take a look. Thanks! |
What changes were proposed in this pull request?
If someone triggers decommission / maintenance, there is potentially a 5 minute lag from the decommission process starting and RM noticing that containers need replication, due to RM running on a 5 minute interval. Similarly, if a node goes dead, it has already been gone for 10 minutes, and it will take up to another 5 minutes for RM to notice and process the containers.
It would be good to notify the RM thread to wake it up when these events happen to reduce the time it takes to start to repair the problem.
One thing that comes to mind about for any solution, is that RM operates by:
If a dead node happens at during step 2, and we notify the thread, it will already be running so the notify will not do anything. It may be that some of the containers from the node in question have been processed already, or they may still to be processed - we don't really know. Perhaps this is OK, rather than complicating the solution, as in general fixing decommission or under-replication will take a long time.
It is also possible that several nodes go dead in quick succession, or several nodes go out of service quickly, resulting in several notify calls occurring. We don't want to wake up the thread too frequently if this happens, as it will result in a new replication queue getting created over and over. Perhaps if the queue is not empty, then there is replication work to do, and we should not run again.
Finally, we might want to consider notifying on a node coming back into service, as that could cause over-replication. However over-replication is not as big of a problem as under-replication if it is not addressed quickly.
What has been done?
DeadNodeHandler
would send a "REPLICATION_MANAGER_NOTIFY" message to event queueSCMNodeManager
would send a "REPLICATION_MANAGER_NOTIFY" message to event queueWhat is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-8660
How was this patch tested?
ReplicationManager
notify call forNodeManger
andDeadNodeHandler
ReplicationMangerIntegration
to test if theReplicationManger
would be notified and start to process over/under replicated container when the persisted op state or health state of nodes changed.CI:
https://github.com/peterxcli/ozone/actions/runs/13785637929https://github.com/peterxcli/ozone/actions/runs/13885375401