JMU CS 470 Clusters


The CS 470 clusters are located in the EnGeo building; there are currently two semi-isolated clusters. The newer cluster (and the primary cluster for this course) has the following hardware:

  • 9x Dell PowerEdge R6525 w/ 2x AMD EPYC 7252 (8C, 3.1 Ghz, HT) 64 GB – compute nodes
  • Dell PowerEdge R6525 w/ 2x AMD EPYC 7252 (8C, 3.1 Ghz, HT) 64 GB – login node
  • Dell PowerEdge R730 w/ Xeon E5-2640v3 (8C, 2.6Ghz, HT) 32 GB – NFS server
  • (in above) 8x 1.2TB 10K SAS HDD w/ RAID - storage
  • Dell N3024 Switch 24x1GbE, 2xCombo, 2x10GbE SFP+

The older cluster shares the above NFS server and switch but has the following (less capable) hardware for computation:

  • 16x Dell PowerEdge R430 w/ Xeon E5-2630v3 (8C, 2.4Ghz, HT) 32 GB – compute nodes
  • Dell PowerEdge R430 w/ 2x Xeon E5-2630v3 (8C, 2.4Ghz, HT) 32 GB – login node
Assembly (empty boxes) Racked servers

The system network architecture of the original cluster is as follows:

Network diagram

The newer login and compute nodes are located similarly to the older ones.


All of the newer nodes are running RHEL8 with Slurm 20.11 for job management. Environment modules are available for OpenMPI (as well as MPICH on the older cluster). Run module avail to see all available modules, and you can find additional software available in the /shared folder. In particular, you will find several useful utilities in /shared/cs470/bin, and I recommend either adding that folder to your PATH environment variable or making symlinks to a folder that is.

Several command-line text editors are installed by default, including nano, vim, and emacs.

If you need software that is not already installed or available via module, it is recommended that you build it from source in your home directory. Check the documentation for the software for instructions on how to do that. If you run into issues or your software is not available in source form, please email the system admin Pete Morris (morrispj) or the faculty contact Mike Lam (lam2mo) to request assistance.

On-campus Access

The login node of the newer cluster is accessible via SSH as from the campus network (use to access the older cluster if you have an account on it).

It is recommended that you set up public/private key SSH access from your most frequent point of access machines (e.g., your personal laptop). To do this, first generate a public/private keypair from a terminal if you have never done so on that machine:

ssh-keygen -t rsa

If prompted, accept the default location and passphrase options by pressing enter twice. Then, copy the public key to the login node using one of the following commands based on your machine's operating system (run in a terminal open in your home folder):

(on Linux or macOS)
  ssh-copy-id <eid>

(on Windows)
  type .ssh\ | ssh <eid> "cat >> .ssh/authorized_keys"

Now you won't need to enter your password every time you log in from that machine. Here is a slightly longer tutorial if you'd like to learn a bit more about this process.

It is also recommended that you edit your ~/.ssh/config file to add an SSH alias. Here is an example entry:

Host cluster
    User <eid>

Now you can log into the cluster from your machine simply by typing this command:

ssh cluster

Off-campus Access

If you are off-campus, you will need to proxy your SSH connection through an on-campus point of access (for CS students, this will probably be stu). To transparently proxy ssh sessions through stu, you can use the "-J" option if it is available:

ssh -J <eid> <eid>

Obviously, it is also recommended that you set up your ~/.ssh/config on your home machine so that you don't have to type all that every time. Assuming your SSH client supports it, you can even do this transparently by adding "ProxyJump stu" to the ~/.ssh/config configuration for the cluster host. Here is an example full SSH config file:

Host stu
    User <eid>

Host cluster
    User <eid>
    ProxyJump stu

Properly configured, you should be able to log into the cluster from off-campus very easily and without having to enter your password with the following command in a terminal:

ssh cluster

In addition, with all of the above properly configured the VS Code Remote SSH extension should allow you to write programs for CS 470 using a graphical IDE on your personal computer (of course you can always just use a command-line text editor on the login node as well -- see the Software section above).

If the "-J" option is not present in your version of ssh, you can add the following to your ~/.ssh/config file:

Host *.oc
   ProxyCommand ssh -W $(echo %h | sed 's/\.oc$//'):%p 2> /dev/null

Now you can log into the cluster from off campus using the following command (and similar syntax for scp):


For more information on proxies and jump hosts, see this Wikibook page.

If you are on Windows, you can also use PuTTY and WinSCP, both of which can be configured with public/private key access (the keys generated above will need to be converted to a .ppk file first, with WinSCP does automatically) and transparent proxying through stu. Other popular Windows SSH/SCP clients include Bitvise and MobaXterm.

Home Directories

If you are a student in CS 470, you should have an account already on the cluster, with a 250MB disk quota in your home directory (/nfs/home/[eid]). To check your disk usage, use the following command:

quota -s

If you need more space temporarily, use your designated scratch space (/scratch/[eid]). CAUTION: The scratch storage space may be wiped between semesters! If you need more permanent space, please contact your instructor or the cluster admin.

You can connect directly to your cluster home directory or scratch directory from a Linux lab machine:

  1. Open the file manager and select File -> Connect to server.
  2. Enter the following settings:
    Type:     ssh
    Folder:   /scratch/<eid> or /nfs/home/<eid>
    Username: <eid>
    Password: <eid password>

Transferring Files

If you need to transfer files back and forth between the cluster and another Unix-based machine (e.g., running Linux or macOS), you can use the scp command (here is a tutorial). If you are off campus, use the -o option to use stu as your jump host (e.g., -o 'ProxyJump' (and you should also consider adding stu to your SSH config as described above so that you can shorten the host name).

If you would prefer to use a graphical interface, I recommend FileZilla on Linux, CyberDuck on macOS, and WinSCP on Windows.

For a more seamless experience, you can also mount the remote filesystem locally using SSH. If you are doing this from off campus, use the following option to sshfs to jump through stu: -o ssh_command="ssh -J <eid>"

Here is a script that you may find helpful:

Submitting Interactive Jobs

You may use the login node to compile your programs and perform other incidental tasks. YOU SHOULD NOT EXECUTE HEAVY COMPUTATION ON THE LOGIN NODE! To properly run compute jobs, you must submit them using Slurm. You can find various Slurm tutorials on their website.

To run simple jobs interactively, use the srun command:

srun [Slurm options] /path/to/program [program options]

The most important Slurm options are the number of processes/tasks (-n) and the number of allocated nodes (-N). If not specified, the number of nodes will be set to the minimum number necessary to satisfy the process requirement.

The cluster has nine compute nodes, each of which has two eight-core AMD processors. Hyperthreading is enabled on the hardware but disabled in Slurm, so the maximum number of processes per node according to Slurm is sixteen. This minimizes unpredictable performance artifacts due to hyperthreading.

Here are some examples:

srun -n 4  hostname                         # 4  processes (single node)
srun -n 32 hostname                         # 32 processes (requires two nodes)
srun -N 4  hostname                         # 4  processes (4 nodes)
srun -N 4 -n 32 hostname                    # 32 processes (across 4 nodes)

Here are some examples of running an MPI program:

srun -n 1 /shared/cs470/mpi-hello/hello
srun -n 16 /shared/cs470/mpi-hello/hello
srun -n 32 /shared/cs470/mpi-hello/hello

If you'd like to open an interactive shell on a compute node for debugging purposes, you can do so using the following command (switch out "bash" if you prefer a different shell):

srun --pty /usr/bin/bash -i

Submitting Batch Jobs

For longer or more complex jobs, you'll want to run them in batch mode so that you can do other things (or even log out) while your job runs. To run in batch mode, you must prepare a job submission script. This also has the added benefit that you won't have to keep typing long commands. Here is a simple job script:

#SBATCH --job-name=hostname
#SBATCH --nodes=1
#SBATCH --ntasks=1


Assuming the above file has been saved as, it can be submitted using the sbatch command:


The job control system will create the job and tell you the new job ID. The results will be saved to a file titled slurm-[id].out with the corresponding job ID. To see a list of jobs currently submitted or running, use the following command:


The results should look similar to this:

              4267     debug sleep_20   lam2mo PD       0:00      1 (None)
              4266     debug sleep_20   lam2mo  R       0:11      1 compute01

To cancel a job, use the scancel command and give it the ID of the job you wish to cancel:

scancel [id]

Please be considerate--do not run long jobs that require all of the nodes. Check regularly for runaway jobs and cancel them. If you find that someone else has a long-running job that you think may be in error, please email that person directly ( and CC the instructor.

For more information on the use of Slurm, see their online tutorials or read the man pages (e.g., "man sbatch" or "man squeue").

Sample Batch Submit Scripts

Regular or Pthreads application (change NAME and EXENAME):

#SBATCH --job-name=NAME
#SBATCH --nodes=1


OpenMP application (change NAME, NTHREADS, and EXENAME):

#SBATCH --job-name=NAME
#SBATCH --nodes=1


To run with multiple thread counts, you can use a Bash loop. Here is an example for OpenMP:

#SBATCH --job-name=NAME
#SBATCH --nodes=1

for t in 1 2 4 8 16 32; do

MPI application (change NAME, NTASKS, and EXENAME):

#SBATCH --job-name=NAME

module load mpi

If you use zsh instead of bash, you may need to include the following line before running module load mpi:

source /usr/share/Modules/init/zsh

Finally, if you need to submit many batch MPI jobs with different process/task counts, you may find it convenient to parameterize the run script and then actually launch the jobs and view the results with different scripts. Here is an example setup:

#SBATCH –job-name=<cmd>-MPI_NUM_TASKS
#SBATCH --output=<cmd>-MPI_NUM_TASKS.txt

module load mpi
srun -n MPI_NUM_TASKS <cmd>
# (run to submit all jobs)
# TODO: customize for the number of tasks needed for your application
for n in 1 8 16 32 64 128; do
    sed -e "s/MPI_NUM_TASKS/$n/g" | sbatch
# (run to view full or partial results)

# TODO: customize for the number of tasks needed for your application
for n in 1 8 16 32 64 128; do
    echo "== $n processes =="
    cat <cmd>-$n.txt



It is possible to use GDB to debug multithreadjed and MPI applications; however, it is more tricky than serial debugging. The GDB manual contains a section on multithreaded debugging, and there is a short FAQ about debugging MPI applications.


Helgrind is a Valgrind-based tool for detecting synchronization errors in Pthreads applications. To run Helgrind, use the following command:

valgrind --tool=helgrind [your-exe]

To run Helgrind on a compute node, make sure you put srun at the beginning of the command:

srun valgrind --tool=helgrind [your-exe]

For more information about using the tool and interpreting its output, see the manual. Note that your program will run considerably slower with Helgrind because of the added analysis cost.

Performance Analysis

GNU Profiler

To run the GNU profiler, you must compile with the "-pg" command-line switch then run your program as usual. It will create a file called gmon.out in the working directory that contains the raw profiling results. To format the output in human-readable tables, use the gprof utility (note that you must pass it the original executable file for debug information):

gprof <exe-name>

The default output is self-documented; the first table contains flat profiling data and the second table contains profiling data augmented by call graph information. There are also many command-line parameters to control the output; use man gprof to see full documentation.

To see line-by-line information (execution counts only), you can use the gcov utility. To do this, you will also need to compile with the "-fprofile-arcs -ftest-coverage" command-line options and run the program as usual. This will create *.gcda and *.gcdo files containing code coverage results. You can then run gcov on the source code files to produce the final results:

gcov <src-names>

This will produce *.c.gcov files for each original source file with profiling annotations.


You can run Valgrind-based tools without any special compilation flags; in fact, you should NOT include the GNU profiler flags because that will introduce irrelevant perturbation into your Valgrind-based results. To run Valgrind-based tools, simply call the valgrind utility and give it the appropriate tool name:

valgrind --tool=callgrind <exe-name>
valgrind --tool=cachegrind <exe-name>

This will produce callgrind.out.* and cachegrind.out.* files in the working directory containing the raw profiling results. To produce human-readable output, use the callgrind_annotate and cg_annotate utilities:

callgrind_annotate <output-file>
cg_annotate <output-file>

The Cachegrind output can take a little while to decipher if you're unfamiliar with it. Here are the most frequent abbreviations:

I instruction
D data
L1 L1 cache
LL last-level cache (L3 on the cluster)
r read
w write
m miss

For Cachegrind results, you can also obtain line-by-line information by passing the source file as a second parameter to cg_annotate. Note that you may need to specify the full path; check the output of the regular cg_annotate to see what file handle you should use.

For more information about all the reports that these tools can generate, see the Valgrind documentation (specifically, see the sections on Callgrind and Cachegrind).

External Resources