Graceful Shutdown for clearml-agent daemon #229

Open
nsckir opened this issue Mar 6, 2025 · 0 comments


nsckir commented Mar 6, 2025

Currently, `clearml-agent daemon --stop` terminates the agent abruptly and aborts the running task.
I’d like a way to have the agent finish its current task and then stop, without picking up any new tasks from the queue.

This would be useful for scenarios where I need to free up resources (e.g., a GPU) for manual work without disrupting ongoing jobs.
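
To make the requested semantics concrete, here is a minimal sketch of a "drain" loop in plain Python. It is not clearml-agent code: `run_worker`, `drain_requested`, and `demo` are hypothetical names used only for illustration, and the in-memory `queue.Queue` stands in for a ClearML queue. The point is that the stop flag is only checked between tasks, so a running task is never interrupted.

```python
import queue
import threading


def run_worker(task_queue, drain_requested, run_task):
    """Pull tasks until the queue is empty or a drain is requested.

    The drain flag is only checked between tasks, so a task that has
    already started always runs to completion.
    """
    completed = []
    while not drain_requested.is_set():
        try:
            task = task_queue.get_nowait()
        except queue.Empty:
            break  # a real daemon would sleep and poll again instead
        run_task(task)
        completed.append(task)
    return completed


def demo():
    """Simulate a stop request that arrives while the first task runs."""
    tasks = queue.Queue()
    for name in ("task-1", "task-2", "task-3"):
        tasks.put(name)
    drain = threading.Event()

    def run_task(name):
        # Hypothetical stand-in for real work; the stop request
        # arrives in the middle of task-1.
        if name == "task-1":
            drain.set()

    completed = run_worker(tasks, drain, run_task)
    return completed, tasks.qsize()
```

In this sketch, `demo()` finishes the task that was in progress when the stop was requested and leaves the remaining tasks in the queue for another worker.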

I tried one approach: setting `agent.reload_config: true` in the configuration file, hoping the agent would reload its config between tasks. Then, when I need the agent to stop, I’d define `agent.downtime` in the config to pause it. However, this doesn’t seem to work: the config isn’t reloaded between tasks as I expected.

I also tried modifying a worker’s queues via the API. My idea was that if I could remove the default queue, or swap it for a dummy queue with no tasks, the worker would naturally stop after finishing its current task.
But that doesn’t seem to be possible either.
