[MAINT] - Address flakiness of Integration tests #2925
Comments
@viniciusdc I think the first case is related to #2947. However, I agree our tests seem to be flaky and that needs to be addressed.
I recently noticed that there is another action you can run with jupyterhub/action-k8s-await-workloads@v3, and it allows you to inspect the affected pods (though usually we don't need it, since it generates too much data). For this specific error it allowed me to find a problem with promtail, as seen below:

[screenshots showing the promtail pod failure]

This is a known issue when running Kind. I think I addressed it in the past, but maybe the recent update to Ubuntu 24.x in #2958 removed that workaround. Since this is a bit different and mostly associated with the above update, I will open a new issue.
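For reference, the usual workaround for that Kind limitation is to raise the inotify limits on the runner before the cluster comes up. A minimal sketch of a workflow step, assuming an Ubuntu runner; the limit values are illustrative, not tuned:

```yaml
# Hypothetical step for the integration-test workflow: raise inotify limits
# before creating the Kind cluster, so pods like promtail don't hit
# "too many open files". The values below are examples only.
- name: Increase inotify limits (Kind known-issue workaround)
  run: |
    sudo sysctl fs.inotify.max_user_watches=524288
    sudo sysctl fs.inotify.max_user_instances=512
```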
This is one workflow where we can see the above error message: https://github.com/nebari-dev/nebari/actions/runs/13415146171/job/37478696321?pr=2965, and here is a second run with the inotify update: https://github.com/nebari-dev/nebari/actions/runs/13417860818/job/37483043593?pr=2965
Context
With the recent adoption of the await-workloads action (a blessing, since before we had to run the kubectl wait commands ourselves), we are occasionally seeing odd issues with the image puller: it seems to get stuck waiting in a couple of deployments. This looks like flaky behavior and requires further validation; we may need to increase the time limit or the number of retries.
source: https://github.com/nebari-dev/nebari/actions/runs/12994981631/job/36240642535?pr=2924
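If we do decide to bump the limits, a minimal sketch of what the await step could look like is below. The input names (`workloads`, `namespaces`, `timeout`, `max-restarts`) are the ones the action documents, but the namespace and the values are assumptions and would need tuning against our actual workflow:

```yaml
# Hypothetical configuration of the await step with a longer timeout and some
# restart tolerance, to ride out slow image pulls. Values are guesses.
- name: Wait for Nebari workloads to become ready
  uses: jupyterhub/action-k8s-await-workloads@v3
  with:
    workloads: ""       # empty string means wait on all workloads
    namespaces: "dev"   # assumption: namespace used by the test deployment
    timeout: 600        # seconds to wait before giving up
    max-restarts: 3     # allow a few container restarts before failing
```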
Also, during releases, we have a hard time running CI against version bumps: by design of the release workflow, the new images are not yet available, so the deployment fails during the pod health checks (namely for jupyterhub).
source: https://github.com/nebari-dev/nebari/actions/runs/12952884533/job/36211476433?pr=2924
Value and/or benefit
Reliable, stable integration test runs.
Anything else?
No response