Fix flaky of test_cancel_launch_and_exec_async
#5456
/smoke-test --kubernetes -k test_cancel_launch_and_exec_async
@zpoint It's possible that this was the intended behavior, but I don't know. Can we confirm this behavior in the past (e.g. in 0.8)? Just want to double check that this isn't actually a regression.
otherwise, code changes look good
sounds good, thanks for looking into it!
/smoke-test --aws -k test_cancel_launch_and_exec_async
/smoke-test --gcp -k test_cancel_launch_and_exec_async
/smoke-test --azure -k test_cancel_launch_and_exec_async
Resolve #5436
The issue only occurs when the cluster is in the UP state. If we call sky down at that point, the job won't show as cancelled, and the grep "cancelled" command will fail. When the cluster is in the INIT state, the test succeeds. This failure is more likely to happen in Kubernetes tests because the Kubernetes cluster provisions so quickly that it has usually reached the UP state by the time sky down runs.

I'm not sure if fixing the test case is the right approach, or if we should modify the sky job system to also cancel jobs for clusters in the UP state when sky down is called? @aylei

Tested (run the relevant ones):
- bash format.sh
- /smoke-test --kubernetes -k test_cancel_launch_and_exec_async (CI) or pytest tests/test_smoke.py::test_name (local)
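
Since the outcome described above depends on which state the cluster is in when sky down runs, one way a test could remove the race is to wait for a known state before tearing the cluster down. The following is only a minimal sketch of that idea, not the change made in this PR: `wait_for_status` and the `get_status` callable are hypothetical names, not SkyPilot APIs, and how the status string is obtained is left to the caller.

```python
import time
from typing import Callable, Collection


def wait_for_status(
    get_status: Callable[[], str],  # hypothetical: returns e.g. "INIT" or "UP"
    wanted: Collection[str],
    timeout_s: float = 30.0,
    poll_s: float = 0.5,
) -> str:
    """Poll get_status() until it returns one of `wanted`, or time out."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status in wanted:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"status stayed {status!r}; wanted one of {sorted(wanted)}")
        time.sleep(poll_s)


if __name__ == "__main__":
    # Stand-in status source for illustration only.
    fake_statuses = iter(["PENDING", "INIT", "UP"])
    print(wait_for_status(lambda: next(fake_statuses), {"INIT"}, poll_s=0.0))
```

A test could then branch on the returned state, for example asserting on "cancelled" only when the teardown happened while the cluster was still in INIT. Whether that is preferable to changing the behavior of sky down for UP clusters is the open question raised above.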