Poll Based Waiting for Job Completion #670

Merged (9 commits) on Aug 29, 2024

MattToast
Member

`Experiment` was given a `wait` method that takes a collection of launched job IDs and waits until each launch reaches a terminal state, either by completing or by erroring out. It implements a polling-based solution.
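
For readers unfamiliar with the approach, here is a minimal sketch of polling-based waiting. It is not the code merged in this PR: the function name `wait_for_jobs`, the `get_status` callback, the string statuses, and the injectable `sleep`/`clock` parameters are all assumptions made for illustration.

```python
import time
from typing import Callable, Dict, Iterable, Optional

# Hypothetical terminal states; the real status type is not shown in this PR page.
TERMINAL_STATES = frozenset({"completed", "failed"})


def wait_for_jobs(
    ids: Iterable[str],
    get_status: Callable[[str], str],
    timeout: Optional[float] = None,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, str]:
    """Poll `get_status` until every job id reports a terminal state."""
    pending = set(ids)
    if not pending:
        raise ValueError("No job ids provided")
    # A timeout of None means there is no deadline: poll indefinitely.
    deadline = None if timeout is None else clock() + timeout
    final: Dict[str, str] = {}
    while True:
        # Ask only about jobs that have not yet finished.
        for job_id in sorted(pending):
            status = get_status(job_id)
            if status in TERMINAL_STATES:
                final[job_id] = status
        pending.difference_update(final)
        if not pending:
            return final
        if deadline is not None and clock() >= deadline:
            raise TimeoutError(f"Jobs still running: {sorted(pending)}")
        # Wait before the next round of status checks.
        sleep(interval)
```

Taking `sleep` and `clock` as parameters is only a convenience for testing the loop without real delays; a caller would normally leave them at their defaults.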

@MattToast self-assigned this on Aug 15, 2024
@MattToast added the labels type: feature, area: api, and type: usability on Aug 15, 2024

codecov bot commented Aug 15, 2024

Codecov Report

Attention: Patch coverage is 97.36842% with 2 lines in your changes missing coverage. Please review.

Project coverage is 43.28%. Comparing base (cce16e6) to head (d1eafc8).
Report is 19 commits behind head on smartsim-refactor.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| smartsim/_core/utils/helpers.py | 77.77% | 2 Missing ⚠️ |
@@                  Coverage Diff                  @@
##           smartsim-refactor     #670      +/-   ##
=====================================================
+ Coverage              40.45%   43.28%   +2.82%     
=====================================================
  Files                    110      110              
  Lines                   7326     7053     -273     
=====================================================
+ Hits                    2964     3053      +89     
+ Misses                  4362     4000     -362     
| Files with missing lines | Coverage Δ |
|---|---|
| smartsim/_core/control/interval.py | 100.00% <100.00%> (ø) |
| smartsim/experiment.py | 84.84% <100.00%> (+3.29%) ⬆️ |
| smartsim/_core/utils/helpers.py | 39.40% <77.77%> (-0.78%) ⬇️ |

... and 5 files with indirect coverage changes

Contributor

@amandarichardsonn amandarichardsonn left a comment

By the mystical powers vested in me, I declare this pull request to be a work of sheer brilliance!

:param timeout: The minimum amount of time to spend polling all jobs to
reach one of the supplied statuses. If not supplied or `None`, the
experiment will poll indefinitely.
:param interval: The minimum time between polling launchers.
Contributor

Is the `interval` param just for us to use for testing? It seems like the user cannot define it.

Member Author

@MattToast MattToast Aug 21, 2024

Exactly! It was something that, right now, could be hard coded, but if in the future we wanted to make it variable we can change the parameter. Totally willing to remove if we think the excess complexity is unnecessary in a YAGNI way!

Contributor

I wouldn't mind leaving it in and keeping it in the docs

Contributor

I say keep it, but not in the docstring
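
One way to picture the arrangement discussed in this thread, where `interval` stays on an internal helper but is not offered on the public method, is sketched below. This is an illustration under stated assumptions, not the PR's implementation: the class `JobWaiter`, the helper `_poll`, the `get_status` callback, the string statuses, and the 5.0 second default are all hypothetical.

```python
import time
from typing import Callable, Dict, Optional, Sequence

DONE = frozenset({"completed", "failed"})  # hypothetical terminal states


class JobWaiter:
    """Hypothetical stand-in; not the class changed in this PR."""

    def __init__(self, get_status: Callable[[str], str]) -> None:
        self._get_status = get_status

    def wait(self, *ids: str, timeout: Optional[float] = None) -> None:
        """Public entry point: callers choose a timeout but not the interval."""
        if not ids:
            raise ValueError("No job ids provided")
        self._poll(ids, timeout=timeout)

    def _poll(
        self,
        ids: Sequence[str],
        timeout: Optional[float] = None,
        interval: float = 5.0,
    ) -> Dict[str, str]:
        """Internal helper: tests can pass a small interval to run quickly."""
        start = time.monotonic()
        while True:
            statuses = {job_id: self._get_status(job_id) for job_id in ids}
            if all(status in DONE for status in statuses.values()):
                return statuses
            if timeout is not None and time.monotonic() - start >= timeout:
                raise TimeoutError("Jobs did not finish before the timeout")
            time.sleep(interval)
```

With this split, a test can call `_poll(ids, interval=0.0)` directly, while the public signature stays small and can grow an `interval` argument later if it is ever needed.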

Contributor

@juliaputko juliaputko left a comment

looooooks great, lmk if there is anything you want me to take a closer look at and I can test out some stuff :)

Contributor

@mellis13 mellis13 left a comment

Only one small question/comment otherwise LGTM!

Contributor

@amandarichardsonn amandarichardsonn left a comment

LGTM! Thanks for adding all the docstrings to the tests!

@MattToast MattToast merged commit 5611a16 into CrayLabs:smartsim-refactor Aug 29, 2024
36 of 37 checks passed
@MattToast MattToast deleted the wait-for-job-end branch August 30, 2024 23:32
Labels: area: api, ignore-for-release, type: feature, type: usability
4 participants