Commit 9973f19

Merge pull request #102 from uvarc/staging
Making sure it's up to date by Monday
2 parents 2b4f01a + 62fd714 commit 9973f19

83 files changed: +977, -336 lines


content/courses/cpp-introduction/setting_up.md

+5-4
@@ -61,13 +61,14 @@ Recently, Microsoft has released the Windows Subsystem for Linux ([WSL](https://
 A drawback to both Cygwin and the WSL is portability of executables. Cygwin executables must be able to find the Cygwin DLL and are therefore not standalone.
 WSL executables only run on the WSL. For standalone, native binaries a good choice is _MinGW_. MinGW is derived from Cygwin.
 
-MinGW provides a free distribution of gcc/g++/gfortran. The standard MinGW distribution is updated fairly rarely and generates only 32-bit executables. We will describe [MinGW-w64](http://mingw-w64.org/doku.php), a fork of the original project.
+MinGW provides a free distribution of gcc/g++/gfortran. The standard MinGW distribution is updated fairly rarely and generates only 32-bit executables. We will describe [MinGW-w64](https://www.mingw-w64.org/), a fork of the original project.
 {{< figure src="/courses/cpp-introduction/img/MinGW1.png" width=500px >}}
 
-MinGW-w64 can be installed beginning from the [MSYS2](https://www.msys2.org/) project. MSYS2 provides a significant subset of the Cygwin tools.
-Download and install it.
+MinGW-w64 can be installed beginning from the [MSYS2](https://www.msys2.org/) project. MSYS2 provides a significant subset of the Cygwin tools. Download and install it.
 {{< figure src="/courses/cpp-introduction/img/MSYS2.png" width=500px >}}
 Once it has been installed, follow the [instructions](https://www.msys2.org/) to open a command-line tool, update the distribution, then install the compilers and tools.
+
+A discussion of installing MinGW-w64 compilers for use with VSCode has been posted by Microsoft [here](https://code.visualstudio.com/docs/cpp/config-mingw).
 
 _Intel oneAPI_
 First install [Visual Studio](https://visualstudio.microsoft.com/vs/community/).

content/courses/fortran-introduction/setting_up.md

+5-4
@@ -54,13 +54,14 @@ Recently, Microsoft has released the Windows Subsystem for Linux ([WSL](https://
 A drawback to both Cygwin and the WSL is portability of executables. Cygwin executables must be able to find the Cygwin DLL and are therefore not standalone.
 WSL executables only run on the WSL. For standalone, native binaries a good choice is _MinGW_. MinGW is derived from Cygwin.
 
-MinGW provides a free distribution of gcc/g++/gfortran. The standard MinGW distribution is updated fairly rarely and generates only 32-bit executables. We will describe [MinGW-w64](http://mingw-w64.org/doku.php), a fork of the original project.
+MinGW provides a free distribution of gcc/g++/gfortran. The standard MinGW distribution is updated fairly rarely and generates only 32-bit executables. We will describe [MinGW-w64](https://www.mingw-w64.org/), a fork of the original project.
 {{< figure src="/courses/fortran-introduction/img/MinGW1.png" width=500px >}}
 
-MinGW-w64 can be installed beginning from the [MSYS2](https://www.msys2.org/) project. MSYS2 provides a significant subset of the Cygwin tools.
-Download and install it.
+MinGW-w64 can be installed beginning from the [MSYS2](https://www.msys2.org/) project. MSYS2 provides a significant subset of the Cygwin tools. Download and install it.
 {{< figure src="/courses/fortran-introduction/img/MSYS2.png" width=500px >}}
-Once it has been installed, follow the [instructions](https://www.msys2.org/) to open a command-line tool, update the distribution, then install the compilers and tools.
+Once it has been installed, follow the [instructions](https://www.msys2.org/) to open a command-line tool, update the distribution, then install the compilers and tools. For Fortran users, the `mingw64` repository may be preferable to the `ucrt64` repo. To find packages, visit their [repository](https://packages.msys2.org/package/).
+
+A discussion of installing MinGW-w64 compilers for use with VSCode has been posted by Microsoft [here](https://code.visualstudio.com/docs/cpp/config-mingw). To use mingw64 rather than ucrt64, simply substitute the text string. Fortran users should install both the C/C++ and Fortran extensions for VSCode.
 
 _Intel oneAPI_
 Download and install the basic toolkit and, for Fortran, the HPC toolkit.
@@ -0,0 +1,107 @@
import sys
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
nprocs = comm.Get_size()

N = 400
M = 600

# This example exchanges halo data among rectangular domains arranged in a
# process grid. Most real codes use squares, but we want to illustrate how to
# use different dimensions.

# Divide up the processes. Either we require a perfect square, or we
# must specify how to distribute by row/column. In a realistic program,
# the process distribution (either the total, for a perfect square, or
# the rows/columns) would be read in and we would need to check that the number
# of processes requested is consistent with the decomposition.

nproc_rows = 2
nproc_cols = 3

if nproc_rows*nproc_cols != nprocs:
    print("Number of rows times columns does not equal nprocs")
    sys.exit()

# Strong scaling: the global N x M grid is divided evenly among the processes.
if N%nproc_rows == 0 and M%nproc_cols == 0:
    nrl = N//nproc_rows
    ncl = M//nproc_cols
else:
    print("The process grid must divide the number of rows and columns evenly.")
    sys.exit()

# Local array with one layer of halo (ghost) cells on each side.
w = np.zeros((nrl+2, ncl+2), dtype=np.double)

# Set up the topology assuming processes are numbered left to right by row.

print("Layout ", nproc_rows, nproc_cols)

my_row = rank//nproc_cols
my_col = rank%nproc_cols

print("Topology ", rank, my_row, my_col)

# Set up boundary conditions on the edges of the global domain.
if my_row == 0:
    w[0,:] = 0.        # top

if my_row == nproc_rows-1:
    w[nrl+1,:] = 100.  # bottom

if my_col == 0:
    w[:,0] = 100.      # left

if my_col == nproc_cols-1:
    w[:,ncl+1] = 100.  # right

# Arbitrary value for the interior that may speed up convergence somewhat.
# Be sure not to overwrite boundaries.
w[1:nrl+1,1:ncl+1] = 50.

# Set up the up, down, left, and right neighbor ranks for each process.
if my_row == 0:
    up = MPI.PROC_NULL
else:
    up = rank - nproc_cols

if my_row == nproc_rows-1:
    down = MPI.PROC_NULL
else:
    down = rank + nproc_cols

if my_col == 0:
    left = MPI.PROC_NULL
else:
    left = rank - 1

if my_col == nproc_cols-1:
    right = MPI.PROC_NULL
else:
    right = rank + 1

print("Neighbors ", rank, my_row, my_col, up, down, left, right)

# Set up an MPI vector type for a column of the local array. The stride is the
# full row length of w, including the two halo columns.
column = MPI.DOUBLE.Create_vector(nrl, 1, ncl+2)
column.Commit()

tag = 0

# Rows are contiguous in memory, so the row exchanges use ordinary buffers.
# Send up and receive down.
comm.Sendrecv([w[1,1:ncl+1], MPI.DOUBLE], up, tag, [w[nrl+1,1:ncl+1], MPI.DOUBLE], down, tag)
# Send down and receive up.
comm.Sendrecv([w[nrl,1:ncl+1], MPI.DOUBLE], down, tag, [w[0,1:ncl+1], MPI.DOUBLE], up, tag)

# Columns are strided, so the column exchanges use the vector type. A flat
# (contiguous) view of w lets the buffer start at an arbitrary element.
wf = w.reshape(-1)
# Send the last interior column right and receive the left halo column.
comm.Sendrecv([wf[(ncl+2)+ncl:], 1, column], right, tag, [wf[(ncl+2):], 1, column], left, tag)
# Send the first interior column left and receive the right halo column.
comm.Sendrecv([wf[(ncl+2)+1:], 1, column], left, tag, [wf[(ncl+2)+ncl+1:], 1, column], right, tag)

# Spot-check the result.
for n in range(nprocs):
    if n == rank:
        print(n, w[0,ncl//2], w[nrl+1,ncl//2])

content/courses/parallel-computing-introduction/distributed_mpi_global1.md

+1-1
@@ -1,5 +1,5 @@
 ---
-title: "Global Communication in MPI: One to Many"
+title: "Collective Communication in MPI: One to Many"
 toc: true
 type: docs
 weight: 50

content/courses/parallel-computing-introduction/distributed_mpi_global2.md

+1-1
@@ -1,5 +1,5 @@
 ---
-title: "Global Communication in MPI: Many To One"
+title: "Collective Communication in MPI: Many To One"
 toc: true
 type: docs
 weight: 52

content/courses/parallel-computing-introduction/distributed_mpi_global3.md

+25-5
@@ -12,7 +12,7 @@ In many-to-many collective communications, all processes in the communicator gro
 
 ## Barrier
 
-When `MPI_Barrier` is invoked, each process pauses until all processes in the communicator group have called this function. The `MPI_BARRIER` is used to synchronize processes. It should be used sparingly, since it "serializes" a parallel program. Most of the global communication routines contain an implicit barrier so an explicit `MPI_Barrier` is not required.
+When `MPI_Barrier` is invoked, each process pauses until all processes in the communicator group have called this function. The `MPI_BARRIER` is used to synchronize processes. It should be used sparingly, since it "serializes" a parallel program. Most of the collective communication routines contain an implicit barrier so an explicit `MPI_Barrier` is not required.
 
 ### C++
 ```c++
@@ -65,11 +65,11 @@ As the examples in the previous chapter demonstrated, when MPI_Reduce is called,
 The syntax for `MPI_Allreduce` is identical to that of `MPI_Reduce` but with the root number omitted.
 
 ```c
-int MPI_Allreduce(void *operand, void *result, int count, MPI_Datatype type, MPI_Op operator, MPI_Comm comm );
+int MPI_Allreduce(void *operand, void *result, int ncount, MPI_Datatype type, MPI_Op operator, MPI_Comm comm );
 ```
 
 ```fortran
-call MPI_ALLREDUCE(sendbuf, recvbuf, count, datatype, op, comm, ierr)
+call MPI_ALLREDUCE(sendbuf, recvbuf, ncount, datatype, op, comm, ierr)
 ```
 
 ```python
@@ -137,7 +137,7 @@ Modify the example gather code in your language of choice to perform an Allgathe
 
 In MPI_Alltoall, each process sends data to every other process. Let us consider the simplest case, when each process sends one item to every other process. Suppose there are three processes and rank 0 has an array containing the values \[0,1,2\], rank 1 has \[10,11,12\], and rank 2 has \[20,21,22\]. Rank 0 keeps (or sends to itself) the 0 value, sends 1 to rank 1, and 2 to rank 2. Rank 1 sends 10 to rank 0, keeps 11, and sends 12 to rank 2. Rank 2 sends 20 to rank 0, 21 to rank 1, and keeps 22.
 
-distributed_mpi_global2.md:{{< figure src="/courses/parallel-computing-introduction/img/alltoall.png" caption="Alltoall. Note that as depicted, the values in the columns are transposed to values as rows." >}}
+{{< figure src="/courses/parallel-computing-introduction/img/alltoall.png" caption="Alltoall. Note that as depicted, the values in the columns are transposed to values as rows." >}}
 
 ### C++
 {{< spoiler text="alltoall.cxx" >}}
@@ -158,4 +158,24 @@ Two more general forms of alltoall exist; `MPI_Alltoallv`, which is similar to `
 
 ## MPI_IN_PLACE
 
-We often do not need the send buffer once the message has been communicated, and allocating two buffers wastes memory and requires some amount of unneeded communication. Several MPI procedures allow the special receive buffer `MPI_IN_PLACE`. When used, the send buffer variable is overwritten with the transmitted data. The expected send and receive buffers must be the same size for this to be valid.
+We often do not need one buffer once the message has been communicated, and allocating two buffers wastes memory and requires some amount of unneeded communication. MPI collective procedures allow the special buffer `MPI_IN_PLACE`. This special value can be used instead of the receive buffer in `Scatter` and `Scatterv`; in the other collective functions it takes the place of the send buffer. The expected send and receive buffers must be the same size for this to be valid. As usual for mpi4py, the Python name of the variable is MPI.IN_PLACE.
+
+**Examples**
+
+```c++
+MPI_Scatter(sendbuf, ncount, MPI_Datatype, MPI_IN_PLACE, ncount, MPI_Datatype, root, MPI_COMM_WORLD);
+
+MPI_Reduce(MPI_IN_PLACE, recvbuf, ncount, MPI_Datatype, MPI_Op, root, MPI_COMM_WORLD);
+```
+
+```fortran
+call MPI_Scatter(vals, ncount, MPI_TYPE, MPI_IN_PLACE, ncount, MPI_TYPE, root, MPI_COMM_WORLD)
+
+call MPI_REDUCE(MPI_IN_PLACE, recvbuf, ncount, MPI_TYPE, MPI_Op, root, MPI_COMM_WORLD, ierr)
+```
+
+```python
+comm.Scatter([sendvals,MPI.DOUBLE],MPI.IN_PLACE,root=0)
+
+comm.Reduce(sendarr, MPI.IN_PLACE, operation, root=0)
+```
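
To see the collectives touched by this hunk of distributed_mpi_global3.md in action, here is a brief mpi4py sketch. It is not part of the commit; the values, the choice of `MPI.SUM`, and the suggested file name are illustrative assumptions.

```python
# Illustrative sketch of Allreduce, in-place Allreduce, and Alltoall in mpi4py.
# Not part of this commit. Run with three ranks, e.g.: mpiexec -n 3 python demo_collectives.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
nprocs = comm.Get_size()

# Allreduce: same arguments as Reduce, but no root; every rank gets the sum.
myval = np.array([float(rank)])
total = np.zeros(1)
comm.Allreduce([myval, MPI.DOUBLE], [total, MPI.DOUBLE], op=MPI.SUM)

# In-place variant: MPI.IN_PLACE replaces the send buffer, and the result
# overwrites myval on every rank.
comm.Allreduce(MPI.IN_PLACE, [myval, MPI.DOUBLE], op=MPI.SUM)

# Alltoall with one item per process: with three ranks, rank 0 starts with
# [0,1,2], rank 1 with [10,11,12], rank 2 with [20,21,22]; afterward rank 0
# holds [0,10,20], rank 1 holds [1,11,21], and rank 2 holds [2,12,22].
sendvals = 10.0*rank + np.arange(nprocs, dtype=np.double)
recvvals = np.zeros(nprocs)
comm.Alltoall([sendvals, MPI.DOUBLE], [recvvals, MPI.DOUBLE])

print(rank, total[0], myval[0], recvvals)
```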

content/courses/parallel-computing-introduction/distributed_mpi_types.md

+57-1
@@ -1,5 +1,5 @@
 ---
-title: "MPI Types"
+title: "MPI Derived Types"
 toc: true
 type: docs
 weight: 220
@@ -9,3 +9,59 @@ menu:
 ---
 
 Modern programming languages provide data structures that may be called "structs," or "classes," or "types." These data structures permit grouping of different quantities under a single variable name.
+
+MPI also provides a general type that enables programmer-defined datatypes. Unlike arrays, which must be adjacent in memory, MPI derived datatypes may consist of elements in noncontiguous locations in memory.
+
+While more general derived MPI datatypes are available, one of the most commonly used is the `MPI_TYPE_VECTOR`. This creates a group of elements of size _blocklength_ separated by a constant interval, called the _stride_, in memory. Examples would be generating a type for columns in a row-major-oriented language, or rows in a column-major-oriented language.
+
+{{< figure src="/courses/parallel-computing-introduction/img/mpi_vector_type.png" caption="Layout in memory for vector type. In this example, the blocklength is 4, the stride is 6, and the count is 3." >}}
+
+C++
+```c++
+MPI_Datatype newtype;
+MPI_Type_vector(ncount, blocklength, stride, oldtype, &newtype);
+```
+Fortran
+```fortran
+integer newtype
+!code
+call MPI_TYPE_VECTOR(ncount, blocklength, stride, oldtype, newtype, ierr)
+```
+For both C++ and Fortran, `ncount`, `blocklength`, and `stride` must be integers. The `oldtype` is a pre-existing type, usually a built-in MPI type such as MPI_FLOAT or MPI_REAL. For C++ the new type is declared as an `MPI_Datatype` unless it corresponds to an existing built-in type. For Fortran `oldtype` is an integer if it is not a built-in type. The `newtype` is a name chosen by the programmer.
+
+Python
+```python
+newtype = oldtype.Create_vector(ncount, blocklength, stride)
+```
+
+A derived type must be _committed_ before it can be used.
+
+```c++
+MPI_Type_commit(&newtype);
+```
+Fortran
+```fortran
+call MPI_TYPE_COMMIT(newtype, ierr)
+```
+Python
+```python
+newtype.Commit()
+```
+
+To use our newly committed type in an MPI communication function, we must pass it the starting position of the data to be placed into the type.
+
+C++
+```c++
+MPI_Send(&a[0][i], 1, newtype, i, 0, MPI_COMM_WORLD);
+//We pass the address of the first element because an array element
+//is not itself a pointer.
+```
+
+Fortran
+```fortran
+call MPI_SEND(a(1,i), 1, newtype, i, 0, MPI_COMM_WORLD, ierr)
+```
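
Before the next file, here is a compact mpi4py sketch of the create/commit/communicate workflow that distributed_mpi_types.md now describes. It is not part of the commit; the array shape, the column index, and the flattened view used to address the starting element are illustrative choices.

```python
# Sketch: send one column of a C-ordered array from rank 0 to rank 1 using a
# committed vector type. Not part of this commit; run with two ranks.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

nrows, ncols, col = 4, 6, 2
a = np.arange(nrows*ncols, dtype=np.double).reshape(nrows, ncols)

# nrows blocks of one element each, separated by the row length: a column of a.
column = MPI.DOUBLE.Create_vector(nrows, 1, ncols)
column.Commit()

if rank == 0:
    flat = a.reshape(-1)  # contiguous view, so the buffer can start at a[0,col]
    comm.Send([flat[col:], 1, column], dest=1, tag=0)
elif rank == 1:
    received = np.empty(nrows, dtype=np.double)
    comm.Recv([received, MPI.DOUBLE], source=0, tag=0)
    print("rank 1 received column", col, ":", received)

column.Free()
```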
@@ -0,0 +1,11 @@
---
title: "MPI Vector Type Example"
toc: true
type: docs
weight: 230
menu:
  parallel_programming:
    parent: Distributed-Memory Programming
---

Our example will construct an $N \times M$ array of floating-point numbers. In C++ and Python we will exchange the "halo" columns using the MPI type, and the rows in the usual way. In Fortran we will exchange "halo" rows with the MPI type and columns with ordinary Sendrecv.

content/courses/python-high-performance/_index.md

+7-7
@@ -23,17 +23,17 @@ For this tutorial, it is assumed that you have experience with programming in Py
 
 ## Setup
 
-To follow along for the [Serial Optimization](#serial-optimization-strategies) and [Multiprocessing](#multiprocessing) examples, you can execute the code examples on your own computer or on UVA's high-performance computing cluster, Rivanna. Examples described in the last section, [Distributed Parallelization](#distributed-parallelization), are best executed on UVA's high-performance computing platform, Rivanna.
+To follow along for the [Serial Optimization](#serial-optimization-strategies) and [Multiprocessing](#multiprocessing) examples, you can execute the code examples on your own computer or on UVA's high-performance computing cluster. Examples described in the last section, [Distributed Parallelization](#distributed-parallelization), are best executed on UVA's high-performance computing platform.
 
 If you are using your local computer, we recommend the Anaconda distribution (<a href="https://www.anaconda.com/distribution/" target="balnk_">download</a>) to run the code examples. Anaconda provides multiple Python versions, an integrated development environment (IDE) with editor and profiler, Jupyter notebooks, and an easy to use package environment manager.
 
-**If you are using Rivanna, follow these steps to verify that your account is active:**
+**If you are using UVA HPC, follow these steps to verify that your account is active:**
 
-### Check your Access to Rivanna
+### Check your Access to UVA HPC
 
-1. In your web browser, got to <a href="https://rivanna-desktop.hpc.virginia.edu" target="_blank">rivanna-desktop.hpc.virginia.edu</a>. This takes you to our FastX web portal that lets you launch a remote desktop environment on Rivanna. If you are off Grounds, you must be connected through the UVA Anywhere VPN client.
+1. In your web browser, go to <a href="https://fastx.hpc.virginia.edu" target="_blank">fastx.hpc.virginia.edu</a>. This takes you to our FastX web portal that lets you launch a remote desktop environment on a frontend. If you are off Grounds, you must be connected through the UVA Anywhere VPN client.
 
-2. Log in with your UVA credentials and start a MATE session. You can find a more detailed description of the Rivanna login procedure <a href="https://www.rc.virginia.edu/userinfo/rivanna/logintools/fastx/" target="_blank">here</a>.
+2. Log in with your UVA credentials and start a MATE session. You can find a more detailed description of the FastX login procedure <a href="https://www.rc.virginia.edu/userinfo/rivanna/logintools/fastx/" target="_blank">here</a>.
    * **User name:** Your UVA computing id (e.g. mst3k; don't enter your entire email address)
   * **Password:** Your UVA Netbadge password
 
@@ -44,14 +44,14 @@ python -V
 ```
 You will obtain a response like
 ```
-Python 3.6.6
+Python 3.11.3
 ```
 Now type
 ```
 spyder &
 ```
 
-For Jupyterlab you can use [Open OnDemand](https://rivanna-portal.hpc.virginia.edu). Jupyterlab is one of the Interactive Apps. Note that these apps submit a job to the compute nodes. If you are working on quick development and testing and you wish to use the frontend, to run Jupyter or Jupyterlab on the FastX portal you can run
+For Jupyterlab you can use [Open OnDemand](https://ood.hpc.virginia.edu). Jupyterlab is one of the Interactive Apps. Note that these apps submit a job to the compute nodes. If you are working on quick development and testing and you wish to use the frontend, to run Jupyter or Jupyterlab on the FastX portal you can run
 ```
 module load anaconda
 anaconda-navigator &
@@ -0,0 +1,6 @@
{
 "cells": [],
 "metadata": {},
 "nbformat": 4,
 "nbformat_minor": 5
}
@@ -0,0 +1,31 @@
import dask
from dask.distributed import Client, progress
import time
import random

def inc(x):
    time.sleep(random.random())
    return x + 1

def dec(x):
    time.sleep(random.random())
    return x - 1

def add(x, y):
    time.sleep(random.random())
    return x + y

if __name__ == "__main__":
    client = Client(threads_per_worker=4, n_workers=1)

    # Build the task graph lazily: dask.delayed must wrap the function itself,
    # not the result of calling it, or the work runs eagerly and serially.
    zs = []
    for i in range(20):
        x = dask.delayed(inc)(i)
        y = dask.delayed(dec)(x)
        z = dask.delayed(add)(x, y)
        zs.append(z)

    # Evaluate all of the delayed results in a single parallel pass.
    zs = list(dask.compute(*zs))
    print(zs)

    client.close()
@@ -0,0 +1 @@
+