Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unify computation of number of parallel jobs #33317

Open
tobiasdiez opened this issue Feb 9, 2022 · 18 comments
Open

Unify computation of number of parallel jobs #33317

tobiasdiez opened this issue Feb 9, 2022 · 18 comments

Comments

@tobiasdiez
Copy link
Contributor

Currently, the number of jobs used for parallel operations is re-computed and set in various places. In this ticket, we unify this to one central place in sage.env, which is then reused across the whole infrastructure. The only exception is the script build/bin/sage-build-num-threads which is only needed in the build of the sage-conf package. This can probably be improved as well, but we leave it for a follow-up ticket.

Changes in detail:

  • Remove SAGE_NUM_THREADS_PARALLEL since it was only used for the docbuild.
  • Introduce the method thread_count in sage.env that based on the environment variable SAGE_NUM_THREADS and the number of CPUs computes the number of parallel jobs sage should use.
  • Use this method (or its cached value THREAD_COUNT) everywhere where previously SAGE_NUM_THREADS has been used.
  • Remove a few places that also calculated the number of CPUs or number of parallel jobs.
  • In particular, remove the computation of the number of threads to use in the make files. This is now exclusively handled through sage.env.
  • Don't specify SAGE_NUM_THREADS on CI, but let this be calculated using the number of CPUs available.

CC: @mkoeppe @jhpalmieri @tscrim @nthiery @fchapoton

Component: build

Branch/Commit: public/build/num_threads @ af4c3e2

Issue created by migration from https://trac.sagemath.org/ticket/33317

@tobiasdiez tobiasdiez added this to the sage-9.6 milestone Feb 9, 2022
@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Feb 9, 2022

Changed commit from 97ebc34 to af4c3e2

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Feb 9, 2022

Branch pushed to git repo; I updated commit sha1. New commits:

af4c3e2Fix style

@jhpalmieri
Copy link
Member

comment:3

Does this duplicate #30639? If so, let's close #30639, since there is no branch there.

@jhpalmieri
Copy link
Member

comment:4

If I'm reading the old code right, the default number of threads was 2, but you've changed it to 10. At least one reason to keep the default low was for the case of shared machines, where a single user should only use many cores at a time intentionally.

@tobiasdiez
Copy link
Contributor Author

comment:5

Replying to @jhpalmieri:

Does this duplicate #30639? If so, let's close #30639, since there is no branch there.

Yes, removing src/bin/sage-num-threads.py is indeed done as part of this ticket. Thanks for pointing this out.

@tobiasdiez
Copy link
Contributor Author

comment:6

Replying to @jhpalmieri:

If I'm reading the old code right, the default number of threads was 2, but you've changed it to 10. At least one reason to keep the default low was for the case of shared machines, where a single user should only use many cores at a time intentionally.

Yes, you are right. Some of the defaults changed as part of this refactoring since the defaults used where not homogeneous. Personally, I would say that we should try to use a high parallelization as the default, because most systems today have more than 8 cores (especially developer machines). People in a shared environment can still overwrite this of course. But if you think another default is more reasonable, this can be easily changed.

@mkoeppe
Copy link
Contributor

mkoeppe commented Feb 9, 2022

comment:7

ALL_CAPS variables in sage.env should be strings, not integers.
Is it really necessary to introduce THREAD_COUNT as a variable? Having a function that returns this value should be enough.

@mkoeppe
Copy link
Contributor

mkoeppe commented Feb 9, 2022

comment:8

Also, I don't see why the words num_threads (which developers are familiar with) should be replaced by thread_count.

@mkoeppe
Copy link
Contributor

mkoeppe commented Feb 9, 2022

comment:9

Also, the point of build/bin/sage-build-num-threads (called in build/make/install) is that it initializes the number of threads in a build context based on the user's use of make -jNUM.
Please don't delete this.

@mkoeppe
Copy link
Contributor

mkoeppe commented Feb 9, 2022

Replying to @tobiasdiez:

The only exception is the script build/bin/sage-build-num-threads which is only needed in the build of the sage-conf package.

... no, that's not what it does

@jhpalmieri
Copy link
Member

comment:11

Replying to @tobiasdiez:

Replying to @jhpalmieri:

If I'm reading the old code right, the default number of threads was 2, but you've changed it to 10. At least one reason to keep the default low was for the case of shared machines, where a single user should only use many cores at a time intentionally.

Yes, you are right. Some of the defaults changed as part of this refactoring since the defaults used where not homogeneous. Personally, I would say that we should try to use a high parallelization as the default, because most systems today have more than 8 cores (especially developer machines). People in a shared environment can still overwrite this of course.

I'm concerned with mathematicians not knowing that they should change it and so accidentally overloading a system. We should not rely on users knowing what they're doing.

@slel
Copy link
Member

slel commented Feb 9, 2022

comment:12

Replying to @jhpalmieri:

I'm concerned with mathematicians not knowing that they
should change it and so accidentally overloading a system.
We should not rely on users knowing what they're doing.

+1

Users who need speed will figure out how to change
a low default to something higher.

They can set the number of threads to use
for parallel computations in their init.sage.

@tscrim
Copy link
Collaborator

tscrim commented Feb 10, 2022

comment:13

Replying to @slel:

Replying to @jhpalmieri:

I'm concerned with mathematicians not knowing that they
should change it and so accidentally overloading a system.
We should not rely on users knowing what they're doing.

+1

Users who need speed will figure out how to change
a low default to something higher.

They can set the number of threads to use
for parallel computations in their init.sage.

+1 as well. We could very likely upset a sysadmin, especially with such a change where even a more knowledgeable user might not be paying so close attention to the Sage development cycle.

@tobiasdiez
Copy link
Contributor Author

comment:14

Replying to @mkoeppe:

Also, I don't see why the words num_threads (which developers are familiar with) should be replaced by thread_count.

I took the cpu_count function from python as a blueprint, but can change the naming of course if you think num_threads is easier / more common.

@tobiasdiez
Copy link
Contributor Author

comment:15

Replying to @mkoeppe:

Also, the point of build/bin/sage-build-num-threads (called in build/make/install) is that it initializes the number of threads in a build context based on the user's use of make -jNUM.
Please don't delete this.

I didn't delete this script, but only the one in src/bin.

@tobiasdiez
Copy link
Contributor Author

comment:16

Replying to @tscrim:

Replying to @slel:

Replying to @jhpalmieri:

I'm concerned with mathematicians not knowing that they
should change it and so accidentally overloading a system.
We should not rely on users knowing what they're doing.

+1

Users who need speed will figure out how to change
a low default to something higher.

They can set the number of threads to use
for parallel computations in their init.sage.

+1 as well. We could very likely upset a sysadmin, especially with such a change where even a more knowledgeable user might not be paying so close attention to the Sage development cycle.

I think you misunderstood me. I didn't change the user-facing multiprocessing for calculation, i.e. what ncpus in sage/parallel returns (at least if I don't overlook something). This still first returns SAGE_NUM_THREADS if set, and only as a fallback uses the number of cores. What I changed was that other parallel processes follow the same strategy. This change should only impact devs that compile sage and run doctests. I'm not sure if it really makes sense to be more strict there than with user-facing calculations. I would expect that devs have more knowledge that they might need to restrict parallel computations in order to not negatively impact their system.

Anyway, if the special treatment of compilation and doctests is indeed preferred, what would be a good default? No parallelization at all if the user doesn't specify SAGE_NUM_THREADS?

@mkoeppe
Copy link
Contributor

mkoeppe commented Feb 10, 2022

comment:17

Replying to @tobiasdiez:

Replying to @mkoeppe:

Also, the point of build/bin/sage-build-num-threads (called in build/make/install) is that it initializes the number of threads in a build context based on the user's use of make -jNUM.
Please don't delete this.

I didn't delete this script, but only the one in src/bin.

You deleted its use in build/make/install

@mkoeppe
Copy link
Contributor

mkoeppe commented Feb 10, 2022

comment:18

Replying to @tobiasdiez:

Replying to @mkoeppe:

Also, I don't see why the words num_threads (which developers are familiar with) should be replaced by thread_count.

I [...] can change the naming of course

Yes, please change it back.

@mkoeppe mkoeppe modified the milestones: sage-9.6, sage-9.7 Apr 2, 2022
@mkoeppe mkoeppe modified the milestones: sage-9.7, sage-9.8 Sep 19, 2022
@mkoeppe mkoeppe removed this from the sage-9.8 milestone Jan 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants