Add the "auto-detect the current allocation" feature #234

Draft: wants to merge 2 commits into master

Conversation

DilumAluthge
Member

Most of the time, I think that users will only be working with a single cluster scheduler, and thus they can just use the relevant ClusterManager directly.

However, in some situations, I think that it might be useful to be able to write a single script that is agnostic to the specific cluster scheduler. In those cases, it would be nice to auto-detect which cluster scheduler is active, and then automatically use the correct ClusterManager.

This PR adds an experimental non-public addprocs_autodetect_current_scheduler() function that implements this.

Note: The addprocs_autodetect_current_scheduler() function should be run from inside an active allocation. For example, with Slurm you would first obtain an allocation (e.g. via sbatch or salloc), and then run this function inside that allocation.
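
As a rough illustration of how detection could work (this is a sketch, not the code in this PR), one option is to look for the job-ID environment variables that each scheduler sets inside an allocation. The function name `detect_scheduler`, the `SCHEDULER_ENV_MARKERS` table, and the returned symbols are all hypothetical; the sketch assumes `SLURM_JOB_ID` (Slurm), `LSB_JOBID` (LSF), and `PBS_JOBID` (PBS) are reliable markers on the cluster in question.

```julia
# Hypothetical sketch: NOT the implementation in this PR.
# Each entry maps an environment variable that a scheduler sets inside an
# allocation to a symbol naming that scheduler. Order determines priority.
const SCHEDULER_ENV_MARKERS = [
    "SLURM_JOB_ID" => :slurm,
    "LSB_JOBID"    => :lsf,
    "PBS_JOBID"    => :pbs,
]

"""
    detect_scheduler(env = ENV)

Return the symbol of the first scheduler whose marker variable is set and
non-empty in `env`, or `nothing` if no allocation is detected.
"""
function detect_scheduler(env = ENV)
    for (var, scheduler) in SCHEDULER_ENV_MARKERS
        # Treat an unset variable and an empty string the same way.
        if !isempty(get(env, var, ""))
            return scheduler
        end
    end
    return nothing
end
```

Taking `env` as an argument means the logic can be exercised with a plain `Dict` outside of any allocation, e.g. `detect_scheduler(Dict("SLURM_JOB_ID" => "12345"))`. A caller would then pick the matching ClusterManager based on the returned symbol, and fall back to an error or to local workers when the result is `nothing`.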

@Moelf
Collaborator

Moelf commented Feb 3, 2025

If we have auto-detect in this package, but certain cluster backends live in separate packages (e.g. LSF), what should the user do?

I imagine we might want to further split backends into their own packages but then the auto-detect would be even less useful.

Is the long-term goal to make this package an umbrella package? If so, maybe this pkg should (optionally) depend on LSF?

@DilumAluthge
Member Author

All good questions. At this point this PR is definitely very speculative, so everything is still up in the air.

I'm not even sure yet whether auto-detect is a good idea.

But if we do pursue auto-detect, then I think that yes, ClusterManagers.jl would need to take direct dependencies on LSFClusterManager.jl, SlurmClusterManager.jl, and any other external packages.

DilumAluthge force-pushed the dpa/auto_detect branch 4 times, most recently from 588479d to 223da63 (February 10, 2025 00:21)

codecov bot commented Feb 10, 2025

Codecov Report

Attention: Patch coverage is 0% with 70 lines in your changes missing coverage. Please review.

Project coverage is 32.10%. Comparing base (a701c48) to head (2777104).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/auto_detect.jl | 0.00% | 70 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #234      +/-   ##
==========================================
- Coverage   37.85%   32.10%   -5.75%     
==========================================
  Files           7        8       +1     
  Lines         391      461      +70     
==========================================
  Hits          148      148              
- Misses        243      313      +70     

