Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDDS-12060. Replace System.currentTimeMillis() with Time.monotonicNow() for duration calculation #8096

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

chiacyu
Copy link
Contributor

@chiacyu chiacyu commented Mar 16, 2025

What changes were proposed in this pull request?

A lot of duration calculations are still using System.currentTimeMillis(). However, from the description of Time.now (which calls System.currentTimeMillis()

Current system time. Do not use this to calculate a duration or interval to sleep, because it will be broken by settimeofday. Instead, use monotonicNow.

Therefore, we should replace them with Time.monotonicNow() / Time.monotonicNowNanos() to prevent the duration calculation being broken by settimeofday.

What is the link to the Apache JIRA

HDDS-12060

How was this patch tested?

Tested by CI

@chiacyu
Copy link
Contributor Author

chiacyu commented Mar 16, 2025

Hi, @ivandika3
Please take a look while you're available, thanks!

long currentTime = System.currentTimeMillis();
long existingCount = processExistingDBRecords(currentTime,
long currentTime = Time.monotonicNow();
long existingCount = processExistingDBRecords(Time.monotonicNow(),
Copy link
Contributor

@peterxcli peterxcli Mar 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this(change in line 150) also used for duration calculation? If I'm wrong, please correct me. Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, @peterxcli
Thanks for the review and comments. In the following line the start would be subtracted to estimate the duration, please take a look and tell me what do you think. Thanks!

      LOG.debug("Container Health task thread took {} milliseconds to" +
              " process {} existing database records.",
          Time.monotonicNow() - start, existingCount);

Copy link
Contributor

@ivandika3 ivandika3 Mar 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chiacyu As @peterxcli pointed out, start already uses Time.monotonicNow. The currentTime is used for the UnhealthyContainers.inStateSince which will be displayed in Recon UI. So currentTime should still use System.currentTimeMillis .

Please kindly revert this change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please revert both lines, not only currentTime.

Copy link
Contributor

@ivandika3 ivandika3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @chiacyu for taking this up. Left a comment regarding issue pointed out by @peterxcli

long currentTime = System.currentTimeMillis();
long existingCount = processExistingDBRecords(currentTime,
long currentTime = Time.monotonicNow();
long existingCount = processExistingDBRecords(Time.monotonicNow(),
Copy link
Contributor

@ivandika3 ivandika3 Mar 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chiacyu As @peterxcli pointed out, start already uses Time.monotonicNow. The currentTime is used for the UnhealthyContainers.inStateSince which will be displayed in Recon UI. So currentTime should still use System.currentTimeMillis .

Please kindly revert this change.

@ivandika3 ivandika3 requested a review from szetszwo March 16, 2025 11:34
Copy link
Contributor

@ivandika3 ivandika3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chiacyu Thanks for the update. LGTM +1.

Copy link
Contributor

@peterxcli peterxcli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ivandika3 for the explanation and @chiacyu for updating the patch. Sorry for my previous comment is not clear enough. Left a new comment, please also take a look. Thanks!

@@ -147,7 +147,7 @@ protected void runTask() throws Exception {
unhealthyContainerStateStatsMap);
long start = Time.monotonicNow();
long currentTime = System.currentTimeMillis();
long existingCount = processExistingDBRecords(currentTime,
long existingCount = processExistingDBRecords(Time.monotonicNow(),
Copy link
Contributor

@peterxcli peterxcli Mar 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After tracing down the code, I think the currentTime param of processExistingDBRecords is not used for calculate duration, it just store it with UnhealthyContainersRecord in recon db.

Suggested change
long existingCount = processExistingDBRecords(Time.monotonicNow(),
long existingCount = processExistingDBRecords(currentTime,

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chiacyu Please revert both of the lines, not only currentTime.

Copy link
Contributor

@szetszwo szetszwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chiacyu , thanks a lot for working on this! The change looks good. Just have two questions inlined.

BTW, I suggest that this PR only changes the trivial replacements, i.e. when the time is used locally in a method for computing elapsed time. For the other more complicated that the time is stored in a class and it is used it multiple places, we can fix it in separated JIRAs. Such cases need to be reviewed carefully.

@@ -123,7 +124,7 @@ public boolean operationalStateExpired() {
if (0 == opStateExpiryEpochSeconds) {
return false;
}
return System.currentTimeMillis() / 1000 >= opStateExpiryEpochSeconds;
return Time.monotonicNow() / 1000 >= opStateExpiryEpochSeconds;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will NodeStatus be stored in disk? If it will, then we cannot use Time.monotonicNow() for opStateExpiryEpochSeconds since it does use unix epoch. It won't work after the server restart.

@@ -147,7 +147,7 @@ protected void runTask() throws Exception {
unhealthyContainerStateStatsMap);
long start = Time.monotonicNow();
long currentTime = System.currentTimeMillis();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

currentTime is still using System.currentTimeMillis(). It will be passed to ContainerHealthRecords.generateUnhealthyRecords(. Not sure if it should be changed to Time.monotonicNow().

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants