Commit 71dc8a3

Merge branch 'staging' of github.com:uvarc/rc-learning into staging
New material for parallel course, and corrections to python-introduction. Should not affect any other files.
2 parents 66de2b8 + 946a530

3 files changed: +30 -19 lines


content/notes/slurm-from-cli/scripts/hello.slurm (+1)

@@ -1,6 +1,7 @@
 #!/bin/bash
 #SBATCH --nodes=1
 #SBATCH --ntasks=1
+#SBATCH --cpus-per-task=1 # total cores per task
 #SBATCH --mem=32000 # mb total memory
 #SBATCH --time=2:00:00
 #SBATCH --partition=interactive
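
For reference, a completed hello.slurm in the spirit of this change might look like the sketch below. The `--account` value and the module name are assumptions; substitute your own allocation group and any module that provides python.

```bash
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1       # total cores per task
#SBATCH --mem=32000             # mb total memory
#SBATCH --time=2:00:00
#SBATCH --partition=interactive
#SBATCH --account=hpc_training  # assumption: substitute your allocation group

module purge                    # start from a clean module environment
module load miniforge           # assumption: any module providing python works
python hello.py                 # run the Python script
```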

content/notes/slurm-from-cli/section2.md (+21 -10)

@@ -52,16 +52,17 @@ The lines starting with `#SBATCH` are the resource requests. They are called "p
 ```bash
 #SBATCH --nodes=1
 #SBATCH --ntasks=1
+#SBATCH --cpus-per-task=1 # total cores per task
 #SBATCH --mem=32000 # mb total memory
 #SBATCH --time=2:00:00
 #SBATCH --partition=interactive
 #SBATCH --account=hpc_training
 ```
 Here we are requesting
-* 1 node, 1 task
+* 1 node, 1 task, 1 core
 * 32GB of memory (measured in MB). Strictly speaking this will be "Gibibytes."
 * 2 hours of running time.
-* The standard partition (queue). A partition must be specified.
+* The interactive partition (queue). A partition must be specified.
 * The account (allocation) group `hpc_training`
 
 The next lines set up the environment to run our job.
@@ -81,11 +82,18 @@ We have chosen to name this script `hello.slurm`, but it can have any name.
 
 **Exercise 1**
 
-Download the hello.slurm and hello.py scripts. Transfer them to the cluster by whatever means you wish. Modify the Slurm script to use your own allocation group name.
+Using the Open OnDemand Slurm Script Generator, create a Slurm script with the following resource requests:
+* 1 node, 1 task, 1 core.
+* 32GB of memory.
+* 2 hours of running time.
+* The interactive partition (queue).
+* The account (allocation) group `hpc_training`.
+
+Using the displayed text file, compare your Slurm script with our example `hello.slurm`. The requested resources should be the same. Once completed, download your Slurm script and transfer it to the cluster by whatever means you wish. Also download `hello.py` and transfer it to the cluster, as it will be needed later.
 
 ## Common Slurm Directives
 
-The most commonly used Slurm directives are listed in the table below. Many options have two versions, one with a single hyphen `-` followed by one letter, or two hyphens `--` followed by an equals sign `=` and a word. Some commands have no single-letter equivalent.
+The most commonly used Slurm directives are listed in the table below. Many options have two versions: a single hyphen `-` followed by one letter, or two hyphens `--` followed by a word and an equals sign `=`. Some options have no single-letter equivalent.
 
 Angle brackets `< >` indicate a value to be specified, and are not typed.
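
To illustrate the two forms described above, a short sketch (not from the course files); these pairings follow standard Slurm usage, where the single-letter options take a space rather than an equals sign:

```bash
#SBATCH -N 1             # equivalent to --nodes=1
#SBATCH -n 1             # equivalent to --ntasks=1
#SBATCH -c 1             # equivalent to --cpus-per-task=1
#SBATCH -t 2:00:00       # equivalent to --time=2:00:00
#SBATCH -p interactive   # equivalent to --partition=interactive
#SBATCH -A hpc_training  # equivalent to --account=hpc_training
```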

@@ -164,9 +172,13 @@ $ module key bio
 
 The available software is also listed on our [website](https://www.rc.virginia.edu/userinfo/rivanna/software/complete-list/)
 
-**Question:**
+**Exercise 2**
+
+Try typing the command `python` in a terminal window. Why was it unable to find the executable? Now load a module of your choosing that provides python. Try the `python` command again. Purge your current modules and try `python` once more.
 
-Why does the command `module load R` give an error?
+Use `module spider R` to show the available R modules and how to load them. Using this information, explain why the command `module load R` gives an error.
+
+Open hello.slurm in any text editor you prefer and add the lines needed to purge existing modules, load a module that provides python, and execute the hello.py script. For reference, check our example hello.slurm.
 
 
 ## Working with Files and Folders
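
A minimal terminal sketch of this exercise, assuming an Lmod-based module system; the module name `miniforge` is hypothetical (pick any module that provides python):

```bash
$ python                 # fails: no module providing python is loaded yet
$ module load miniforge  # hypothetical module name
$ python --version       # now resolves to the module's python
$ module purge           # unload all modules; python is unavailable again
$ module spider R        # list R versions and the modules needed to load them
```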
@@ -214,9 +226,8 @@ $pwd
 /home/mst3k/shakespeare
 ```
 
-**Exercise 2**
-
-Use FastX or Open OnDemand or the command line to create a new folder under your scratch directory. Practice changing into and out of it.
+**Exercise 3**
 
-Use FastX and Caja to navigate to your `/scratch` directory. To get there, click `Go` in the Caja menu. A textbox will open. Be sure that "search for files" is unchecked. Erase whatever is in the textbox and type `/scratch/mst3k` (substituting your own user ID). Still in FastX, open a terminal (the black box, or in the System Tools menu) and navigate to your new scratch folder.
+Use FastX or Open OnDemand or the command line to create a new folder under your scratch directory. Practice changing into and out of it. Move hello.slurm and hello.py into the newly created folder.
 
+Use FastX and Caja to navigate to your `/scratch` directory. To get there, click `Go` in the Caja menu. A textbox will open. Be sure that "search for files" is unchecked. Erase whatever is in the textbox and type `/scratch/mst3k` (substituting your own user ID). Still in FastX, open a terminal (the black box, or in the System Tools menu) and navigate to your new scratch folder.
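
A command-line version of this exercise might look like the sketch below; the folder name `slurm_course` is hypothetical, and it assumes the two scripts were transferred to your home directory:

```bash
$ cd /scratch/$USER               # your personal scratch folder
$ mkdir slurm_course              # hypothetical folder name
$ cd slurm_course                 # change into it
$ cd ..                           # and back out
$ cd slurm_course
$ mv ~/hello.slurm ~/hello.py .   # move the scripts into the new folder
```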

content/notes/slurm-from-cli/section3.md (+8 -9)

@@ -11,18 +11,14 @@ menu:
 ## Running Jobs from Scratch
 
 We recommend that you run your jobs out of your /scratch directory.
-* Your personal /scratch/mst3k folder has much more storage space than your home directory.
-* /scratch is on a Weka filesystem, a storage system designed specifically for fast access.
-* /scratch is connected to the compute nodes with Infiniband, a very fast network connection.
+* Your personal /scratch/mst3k folder has much more storage space than your home directory.
+* /scratch is on a Weka filesystem, a storage system designed specifically for fast access.
+* /scratch is connected to the compute nodes with Infiniband, a very fast network connection.
 
 {{< alert >}}
 The scratch system is not permanent storage, and files older than 90 days will be marked for deletion (purging). You should keep copies of your programs and data in more permanent locations such as your home directory, leased storage such as /project or /standard, or on your lab workstation. After your jobs finish, copy the results to permanent storage.
 {{< /alert >}}
 
-**Exercise 3**
-
-Move or copy the hello.slurm script and the hello.py script to the new folder you created in your scratch directory in Exercise 2. Submit hello.slurm.
-
 ## Submitting a Job
 
 Once we have navigated to the desired working directory in a terminal window, we use the `sbatch` command to submit the job. This assumes that your Slurm script is located in the current working directory.
@@ -38,9 +34,12 @@ We do not make the script executable. The system handles that.
 $sbatch myjob.slurm
 Submitted batch job 36805
 ```
-
 Always remember that you submit your **job script** and not your executable or interpreter script.
 
+**Exercise**
+
+From the working directory containing hello.slurm, submit the job.
+
 ## Monitoring a Job
 
 Once submitted, we can monitor our jobs.
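
A sketch of this submission, reusing the hypothetical scratch folder from the earlier exercise; the job ID shown is illustrative:

```bash
$ cd /scratch/$USER/slurm_course   # hypothetical working directory
$ sbatch hello.slurm
Submitted batch job 36806          # illustrative ID; yours will differ
```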
@@ -107,7 +106,7 @@ Write a Slurm script that requests 30 minutes of time. Submit a job that will ru
 ```bash
 sleep 30m
 ```
-as the command. You won't need to request a specific amount of memory. Submit this script and monitor your job's status. Once it starts, let it run for a few minutes, then cancel it.
+as the command. You won't need to request a specific amount of memory. Submit this script and monitor your job's status in the queue with `squeue` or the Active Jobs tab. Once it starts, get information about your job with `scontrol`, let it run for a minute, then cancel it with `scancel`. Practice with the terminal commands or the OOD GUI. Note that you will need your job's ID for the last two commands.
 
 {{< spoiler text="Example script" >}}
 {{< code-download file="/notes/slurm-from-cli/scripts/slow.slurm" lang="bash" >}}
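
The three monitoring commands named in the revised exercise might be used as follows; `36806` stands in for the ID printed by `sbatch`:

```bash
$ squeue -u $USER           # show your jobs and their states in the queue
$ scontrol show job 36806   # detailed information about the job
$ scancel 36806             # cancel the job
```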
