NOTE: these exercises have been tested on MI210 and MI300A accelerators using a container environment.
For details on the container environment (such as the operating system and the modules available), please see README.md
in this repo.
Choose one or more of the CUDA samples in the HPCTrainingExamples/HIPIFY/mini-nbody/cuda
directory and manually convert it to HIP. Tip: most CUDA runtime calls keep their name and swap the cuda prefix for hip; for example, cudaMalloc becomes hipMalloc.
You can choose from nbody-block.cu, nbody-orig.cu, and nbody-soa.cu.
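To make the renaming tip concrete, here is a minimal sketch that applies three hand-written rename rules to a toy fragment. The fragment and the names d_buf, buf, and bytes are hypothetical and are not taken from the nbody samples. A real manual port also needs the HIP runtime header (hip/hip_runtime.h) and a check of anything the renames do not cover.

```shell
#!/bin/bash
# Toy CUDA fragment (hypothetical, not one of the nbody samples).
cuda_fragment='cudaMalloc(&d_buf, bytes);
cudaMemcpy(d_buf, buf, bytes, cudaMemcpyHostToDevice);
cudaFree(d_buf);'

# Apply the renames one rule at a time. The cudaMemcpy rule also rewrites
# the cudaMemcpyHostToDevice enum because it shares the same prefix.
hip_fragment=$(printf '%s\n' "$cuda_fragment" | sed \
    -e 's/cudaMalloc/hipMalloc/g' \
    -e 's/cudaMemcpy/hipMemcpy/g' \
    -e 's/cudaFree/hipFree/g')

printf '%s\n' "$hip_fragment"
```

The script prints the same three calls with hipMalloc, hipMemcpy (with hipMemcpyHostToDevice), and hipFree in place of the CUDA names.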
You'll want to compile on the node you've been allocated so that hipcc will choose the correct GPU architecture.
Use the hipify-perl script to "hipify" the CUDA samples that you converted manually to HIP in Exercise 1. hipify-perl is in the $ROCM_PATH/hip/bin
directory and should be in your path.
First test the conversion to see what will be converted:
hipify-perl -examine nbody-orig.cu
You'll see the statistics of HIP APIs that will be generated. The output might be different depending on the ROCm version.
[HIPIFY] info: file 'nbody-orig.cu' statistics:
CONVERTED refs count: 7
TOTAL lines of code: 91
WARNINGS: 0
[HIPIFY] info: CONVERTED refs by names:
cudaFree => hipFree: 1
cudaMalloc => hipMalloc: 1
cudaMemcpyDeviceToHost => hipMemcpyDeviceToHost: 1
cudaMemcpyHostToDevice => hipMemcpyHostToDevice: 1
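The same check can be repeated for all three samples. The following is a sketch only: the HIPIFY variable is an illustrative override hook introduced here (it is not a hipify-perl feature), and the loop assumes it is run from the mini-nbody/cuda directory.

```shell
#!/bin/bash
# HIPIFY is a hypothetical override hook so the tool name can be swapped.
HIPIFY=${HIPIFY:-hipify-perl}

examine_samples() {
    # Report what would be converted for each sample, without writing files.
    for src in nbody-block.cu nbody-orig.cu nbody-soa.cu; do
        echo "== $src =="
        "$HIPIFY" -examine "$src"
    done
}

# Only run the loop if the tool is actually available.
if command -v "$HIPIFY" >/dev/null 2>&1; then
    examine_samples
else
    echo "$HIPIFY not found in PATH"
fi
```

Comparing the per-file statistics is a quick way to spot a sample that uses an API the other two do not.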
If hipify-perl is not found, check that the $ROCM_PATH/hip/bin directory is in your path. The script's name and location can differ between ROCm versions.
Now let's actually do the conversion.
hipify-perl nbody-orig.cu > nbody-orig.cpp
Compile the HIP programs.
hipcc -DSHMOO -I ../ nbody-orig.cpp -o nbody-orig
The -DSHMOO flag defines the SHMOO macro, which fixes some timer printouts. Add --offload-arch=<gpu_type>
to specify the GPU type and avoid autodetection issues when running on a single GPU on a node.
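One possible way to find a value for <gpu_type> is to read it from rocminfo, as sketched below. The helper name first_gfx_arch is hypothetical, and the script only prints the resulting hipcc command rather than running it. For reference, MI210 reports gfx90a and MI300A reports gfx942.

```shell
#!/bin/bash
# Pick the first gfx target from a rocminfo-style listing read on stdin.
first_gfx_arch() {
    grep -o 'gfx[0-9a-f]*' | head -n 1
}

arch=""
if command -v rocminfo >/dev/null 2>&1; then
    arch=$(rocminfo | first_gfx_arch)
fi

if [ -n "$arch" ]; then
    # Dry run: show the compile line with the detected architecture.
    echo "hipcc -DSHMOO -I ../ --offload-arch=$arch nbody-orig.cpp -o nbody-orig"
else
    echo "No gfx target detected; pass --offload-arch=<gpu_type> by hand"
fi
```

On a node with several different GPUs, the first match may not be the device you intend to use, so check the full rocminfo output in that case.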
- Fix any compiler issues, for example, if something didn't hipify correctly.
- Be on the lookout for hard-coded NVIDIA-specific things like warp sizes and PTX.
Run the program:
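A rough text search can help with the second point. The sketch below flags inline asm (where PTX usually lives), warp intrinsics, warpSize, and a bare literal 32. The function name find_nvidia_isms and the pattern list are assumptions made for illustration. Expect false positives, and note that MI210 and MI300A use a wavefront size of 64 rather than 32.

```shell
#!/bin/bash
# List lines that may hide NVIDIA-specific assumptions. Heuristic only.
find_nvidia_isms() {
    grep -nE 'asm|__shfl|__ballot|warpSize|(^|[^0-9A-Za-z_.])32([^0-9A-Za-z_.]|$)' "$@"
}

# Demonstrate on a hypothetical two-line fragment.
sample=$(mktemp)
printf '%s\n' 'int lane = threadIdx.x % 32;' 'float x = 1.0f;' > "$sample"
find_nvidia_isms "$sample"
rm -f "$sample"
```

For the toy fragment this prints `1:int lane = threadIdx.x % 32;`. On real code, call it as `find_nvidia_isms nbody-orig.cpp` and review each hit by hand.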
./nbody-orig
A batch version of Exercise 2 is:
#!/bin/bash
#SBATCH -N 1
#SBATCH --ntasks=1
#SBATCH --gpus=1
#SBATCH -p LocalQ
#SBATCH -t 00:10:00
pwd
module load rocm
cd HPCTrainingExamples/HIPIFY/mini-nbody/cuda
hipify-perl -print-stats nbody-orig.cu > nbody-orig.cpp
hipcc -DSHMOO -I ../ nbody-orig.cpp -o nbody-orig
./nbody-orig
Notes:
- Hipify tools do not check correctness
- hipconvertinplace-perl is a convenience script that runs the hipify-perl -inplace -print-stats command.
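Because the tools do not check correctness, it can be useful to keep the original CUDA sources next to the converted ones for comparison. The sketch below copies the .cu files to a working directory and converts only the copies, using the same flags named above. The function name hipify_copy and the HIPIFY override variable are hypothetical.

```shell
#!/bin/bash
# HIPIFY is a hypothetical override hook so the tool name can be swapped.
HIPIFY=${HIPIFY:-hipify-perl}

# Usage: hipify_copy <source_dir> <work_dir>
hipify_copy() {
    src_dir=$1
    work_dir=$2
    # Copy the CUDA sources so the originals stay untouched.
    mkdir -p "$work_dir" && cp "$src_dir"/*.cu "$work_dir"/ || return 1
    for f in "$work_dir"/*.cu; do
        "$HIPIFY" -inplace -print-stats "$f"
    done
}
```

For example, `hipify_copy . ./hip-work` converts copies under ./hip-work, after which `diff nbody-orig.cu hip-work/nbody-orig.cu` shows what the tool changed.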