
Commit 6b3a742

Rename preempt to preemptable
1 parent dc546d1 commit 6b3a742

File tree

- docs/general/news.md
- docs/slurm/gpus.md
- docs/slurm/partitions.md

3 files changed: +59 −5 lines

docs/general/news.md

+54 −0
@@ -1,5 +1,59 @@
# News

22.05.2024:

- UBELIX went through a major upgrade:
    - The operating system was upgraded to Rocky Linux 9.3
    - The supported software stack was updated. Supported toolchain versions are now 2021a through 2023a
    - The scheduler accounting hierarchy was restructured and simplified
    - The monitoring system was upgraded to Grafana
    - The user documentation was refactored

**SSH Key Switch**

Please be aware that the sshd configuration of UBELIX has changed. Consequently, only ED25519 host keys are supported. You will receive a warning when connecting to UBELIX for the first time after the change. You will need to remove the old host keys from your known hosts:

- ssh-keygen -R submit.unibe.ch
- ssh-keygen -R submit01.unibe.ch
- ssh-keygen -R submit02.unibe.ch
- ssh-keygen -R submit03.unibe.ch
- ssh-keygen -R submit04.unibe.ch

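For convenience, the removals above can also be run in one go; a minimal sketch using exactly the hostnames listed here:

```Bash
# Remove the outdated host keys for each UBELIX login host from ~/.ssh/known_hosts
for host in submit.unibe.ch submit01.unibe.ch submit02.unibe.ch submit03.unibe.ch submit04.unibe.ch; do
    ssh-keygen -R "$host"
done
```
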
The new ED25519 host key fingerprints are:

- submit01.unibe.ch (130.92.250.231) - SHA256:qmMfIbwyosfLUsY8BMCTgj6HjQ3Im6bAdhCWK9nSiDs
- submit02.unibe.ch (130.92.250.232) - SHA256:eRTZGWp2bvlEbl8O1pigcONsFZAVKT+hp+5lSQ8lq/A
- submit03.unibe.ch (130.92.250.233) - SHA256:PUkldwWf86h26PSFHCkEGKsrYlXv668LeSnbHBrMoCQ
- submit04.unibe.ch (130.92.250.234) - SHA256:D3cmfXkb40P7W935J2Un8sBUd4Sv2MNLkvz9isJOnu0

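To check what a login node actually presents before accepting its key, a sketch like the following can be compared against the fingerprints above (this ssh-keyscan pipeline is an editorial example, not part of the original announcement):

```Bash
# Fetch the ED25519 host key of a login node and print its SHA256 fingerprint
ssh-keyscan -t ed25519 submit01.unibe.ch 2>/dev/null | ssh-keygen -lf -
```
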
**Software Stack**

Since all software modules have changed, we advise recompiling all custom software against the new toolchains!

Where available, the supported software was rebuilt in its most recent stable version with the foss/2023a toolchain. You can search for packages or modules containing a specific string with "module spider", and list all currently available packages with "module avail". Beware, the latter list is very long, so "module spider" is usually the more useful option.

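For example, with a placeholder package name (replace it with the software you are actually looking for):

```Bash
# Search all module trees for anything matching the given name (case-insensitive)
module spider gromacs

# List every currently available module; the output is very long
module avail
```
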
In case you're missing software, please follow these steps:

- Check whether a newer version of the software is already available (module spider <software>) and use that version. If this isn't possible, you will need to install it yourself; see our documentation for more information.
- Check whether the easybuilders/easybuild-easyconfigs repository provides an easyconfig for your tool/version with the foss/2023a or intel/2023a toolchain.
- Follow the instructions in this documentation to install the software into your personal stack (see the sketch after this list). If the software is useful to a larger group of users, please open a ticket. Note that software from unsupported toolchains will not be installed centrally.

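If the EasyBuild `eb` command is available in your environment, a personal installation could look roughly like the sketch below; the package and easyconfig names are placeholders, not taken from the announcement:

```Bash
# Search the easyconfig repository for build recipes matching the desired toolchain
eb --search 'GROMACS.*foss-2023a'

# Build and install the chosen easyconfig into your personal stack,
# resolving missing dependencies automatically
eb GROMACS-2023.3-foss-2023a.eb --robot
```
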
**SLURM changes**

Slurm associations are no longer set on partitions. This means it is now possible to submit a job to both the epyc2 and the bdw partition, e.g. --partition=epyc2,bdw. When no partition is specified in the job script, partition=epyc2,bdw is the default. The scheduler will then try to start your job as early as possible on either of the two partitions, prioritizing the partition listed first.

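A minimal job script illustrating this multi-partition submission (the resource requests below are placeholder values):

```Bash
#!/bin/bash
#SBATCH --job-name=multi-partition-example
#SBATCH --partition=epyc2,bdw    # whichever partition can start the job first will be used
#SBATCH --cpus-per-task=1
#SBATCH --time=01:00:00

srun hostname
```
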
To eliminate confusion, the QoS "job_gpu_preempt" has been renamed to "job_gpu_preemptable" to indicate that jobs submitted with this QoS are in fact preemptable by investor jobs.

Also, there are no longer personal and workspace accounts. This means you don't have to specify an account when submitting jobs.

Finally, the per-user CPU core limit that previously capped the resources a user could use at the same time has been replaced by a maximum CPU hours limit per user. This should improve the overall scheduling performance.

**Monitoring**

The status web page is now available at https://ubelix.hpc.unibe.ch. Please note that user jobs are no longer displayed on the status page. Use the "squeue --me" command to get a high-level overview of all your active (running and pending) jobs in the cluster.

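For example (only `squeue --me` is mentioned in the announcement; `--long` is a standard squeue option added here for illustration):

```Bash
# Compact overview of all your running and pending jobs
squeue --me

# Same information in the more detailed long format
squeue --me --long
```
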
12.01.2024:

- The user documentation has been streamlined and updated with recent information

docs/slurm/gpus.md

+3 −3
@@ -37,11 +37,11 @@ default job QoS:
```


-## QoS `job_gpu_preempt`
+## QoS `job_gpu_preemptable`

For investors we provide the `gpu-invest` investor partitions with a specific
QoS per investor that guarantees instant access to the purchased resources.
-Nevertheless, to efficiently use all resources, the QoS `job_gpu_preempt` exists
+Nevertheless, to efficiently use all resources, the QoS `job_gpu_preemptable` exists
in the `gpu` partition. Jobs, submitted with this QoS have access to all GPU
resources, but may be interrupted if resources are required for investor jobs.
Short jobs, and jobs that make use of checkpointing will benefit from these
@@ -51,7 +51,7 @@ Example: Requesting any four RTX3090 from the resource pool in the `gpu`
partition:
```Bash
#SBATCH --partition=gpu
-#SBATCH --qos=job_gpu_preempt
+#SBATCH --qos=job_gpu_preemptable
#SBATCH --gres=gpu:rtx3090:4
## Use the following option to ensure that the job, if preempted,
## won't be re-queued but canceled instead:

docs/slurm/partitions.md

+2 −2
@@ -45,6 +45,6 @@ module load Workspace # use the Workspace account
sbatch --partition=gpu-invest job.sh
```

-!!! note "Preempt"
+!!! note "Preemptable"
    The resources dedicated to investors can be used by non-investing users too.
-    A certain amount of CPUs/GPUs are "reserved" in the investor partitions. But if not used, jobs with the QOS `job_gpu_preempt` can run on these resources. But beware that preemptable jobs may be terminated by investor jobs at any time! Therefore use the qos `job_gpu_preempt` only if your job supports checkpointing or restarts.
+    A certain amount of CPUs/GPUs is "reserved" in the investor partitions. If not used, jobs with the QoS `job_gpu_preemptable` can run on these resources. But beware that preemptable jobs may be terminated by investor jobs at any time! Therefore, use the QoS `job_gpu_preemptable` only if your job supports checkpointing or restarts.

0 commit comments
