Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDDS-8660. Notify ReplicationManager when nodes go dead or out of service #7997

Open
wants to merge 15 commits into
base: master
Choose a base branch
from

Conversation

peterxcli
Copy link
Contributor

@peterxcli peterxcli commented Mar 2, 2025

What changes were proposed in this pull request?

If someone triggers decommission / maintenance, there is potentially a 5 minute lag from the decommission process starting and RM noticing that containers need replication, due to RM running on a 5 minute interval. Similarly, if a node goes dead, it has already been gone for 10 minutes, and it will take up to another 5 minutes for RM to notice and process the containers.

It would be good to notify the RM thread to wake it up when these events happen to reduce the time it takes to start to repair the problem.

One thing that comes to mind about for any solution, is that RM operates by:

  1. Getting a list of all containers.
  2. Processing the list
  3. Sleeping for 5 minutes.

If a dead node happens at during step 2, and we notify the thread, it will already be running so the notify will not do anything. It may be that some of the containers from the node in question have been processed already, or they may still to be processed - we don't really know. Perhaps this is OK, rather than complicating the solution, as in general fixing decommission or under-replication will take a long time.

It is also possible that several nodes go dead in quick succession, or several nodes go out of service quickly, resulting in several notify calls occurring. We don't want to wake up the thread too frequently if this happens, as it will result in a new replication queue getting created over and over. Perhaps if the queue is not empty, then there is replication work to do, and we should not run again.

Finally, we might want to consider notifying on a node coming back into service, as that could cause over-replication. However over-replication is not as big of a problem as under-replication if it is not addressed quickly.

What has been done?

  • Add ReplicatonManagerEventHandler to handle message with "REPLICATION_MANAGER_NOTIFY" then notify RM's thread
  • When one node go to dead, DeadNodeHandler would send a "REPLICATION_MANAGER_NOTIFY" message to event queue
  • When the persisted op state of one node changed after receiving DN report, SCMNodeManager would send a "REPLICATION_MANAGER_NOTIFY" message to event queue

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-8660

How was this patch tested?

  • unit tests to test the ReplicationManager notify call for NodeManger and DeadNodeHandler
  • Add ReplicationMangerIntegration to test if the ReplicationManger would be notified and start to process over/under replicated container when the persisted op state or health state of nodes changed.

CI:
https://github.com/peterxcli/ozone/actions/runs/13785637929
https://github.com/peterxcli/ozone/actions/runs/13885375401

@adoroszlai adoroszlai requested a review from sodonnel March 3, 2025 21:35
@peterxcli
Copy link
Contributor Author

Temporarily convert to a draft, as I would like to introduce ReplicationActionHandler to act as an event subscriber and move all notify calls on ReplicationManager to go through the eventQueue.

@peterxcli peterxcli marked this pull request as draft March 6, 2025 04:12
@peterxcli peterxcli force-pushed the hdds8660-ReplicationManager-Notify-when-dead-nodes-or-nodes-go-out-of-service branch from 8c56c07 to bc7aa2f Compare March 8, 2025 13:16
@peterxcli peterxcli marked this pull request as ready for review March 11, 2025 11:29
@peterxcli
Copy link
Contributor Author

Hi @adoroszlai @sodonnel,
This PR is ready for review. Whenever you have time, please take a look. Thanks!
I’ve also added more details to the description.

@adoroszlai adoroszlai changed the title HDDS-8660. ReplicationManager: Notify when dead nodes or nodes go out of service HDDS-8660. Notify ReplicationManager when nodes go dead or out of service Mar 13, 2025
Copy link
Contributor

@xichen01 xichen01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@peterxcli Thanks for your patch, overall looks good. A few comments for your reference.

maybeNotifyReplicationManager(reportedDn, oldPersistedOpState, newPersistedOpState);
}

private void maybeNotifyReplicationManager(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should only notify when current SCM is leader.

// Only wake up the thread if there's no active replication work
// This prevents creating a new replication queue over and over
// when multiple nodes change state in quick succession
if (getQueue().isEmpty()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can check isThreadWaiting too, only the wait thread can be notified.

// Notify ReplicationManager
LOG.info("Notifying ReplicationManager about dead node: {}",
datanodeDetails);
publisher.fireEvent(SCMEvents.REPLICATION_MANAGER_NOTIFY, datanodeDetails);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can only notify when this node is not IN_MAINTENANCE status.

}

@Override
public void onMessage(DatanodeDetails datanodeDetails, EventPublisher eventPublisher) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A leader check should be added.

@peterxcli peterxcli requested a review from xichen01 March 16, 2025 22:37
@peterxcli
Copy link
Contributor Author

@xichen01 Thanks for the review, addressed all comments, please take a look. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants