Slurm Workload Manager


The clusters use Slurm as their workload manager. Slurm provides a large set of commands to submit jobs, report their state, attach to running programs, or cancel submissions. This section demonstrates their usage by example to get you started.

Compute jobs can be submitted and controlled from a central login node (iffslurm), which you can reach using ssh.


Run the sinfo command to get a list of available queues (which are called partitions in Slurm) and free nodes:

PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
nanofer      up   infinite     10   idle iffcluster[0601-0610]
nanofer      up   infinite      6  alloc iffcluster[0611-0616]
nanofer      up   infinite      1  down* iffcluster0617

By default, only the partitions you have access to are listed (here, those for nanofer members). In this example, 10 nodes are free for job submissions, 6 have been allocated by other nanofer users, and one node does not respond.
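To get a quick summary instead of reading the node lists, the sinfo output can be post-processed. The following is a minimal sketch, assuming the default sinfo column layout shown above (node count in the fourth column, state in the fifth); the function name count_nodes_by_state is hypothetical.

```shell
# Sum the node counts per state from sinfo-style output read on stdin.
# Assumes the default column layout: PARTITION AVAIL TIMELIMIT NODES STATE NODELIST.
count_nodes_by_state() {
    # Skip the header line, add up column 4 (NODES) grouped by column 5 (STATE).
    awk '$1 != "PARTITION" { n[$5] += $4 } END { for (s in n) print s, n[s] }' | sort
}

# Usage on the login node:
# sinfo | count_nodes_by_state
```

For the listing above, this would report 6 allocated, 1 down, and 10 idle nodes.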

Next, try to submit a simple MPI job. The following modified ring communication example can be used to test multiple nodes and their interconnects:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>


int main(void) {
    int world_rank, world_size, token;
    MPI_Init(NULL, NULL);
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

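    /* Every rank except 0 first waits for the token from its left neighbor. */
    if (world_rank != 0) {
        MPI_Recv(&token, 1, MPI_INT, world_rank - 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Process %d received token %d from process %d\n", world_rank, token, world_rank - 1);
    } else {
        /* Rank 0 starts the ring with the initial token value. */
        token = 0;
    }

    /* Increment the token and pass it on to the right neighbor. */
    token++;
    MPI_Send(&token, 1, MPI_INT, (world_rank + 1) % world_size, 0, MPI_COMM_WORLD);

    /* Rank 0 receives the token back after it has passed every other rank. */
    if (world_rank == 0) {
        MPI_Recv(&token, 1, MPI_INT, world_size - 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("Process %d received token %d from process %d\n", world_rank, token, world_size - 1);
    }

    /* Keep the job alive long enough to try further Slurm commands
       (the duration of 30 seconds is an arbitrary choice). */
    sleep(30);

    MPI_Finalize();
    return EXIT_SUCCESS;
}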

This simple example forms a ring from the processes on the allocated compute nodes and passes an incremented token to the next ring neighbor until it has visited every process once. The sleep call guarantees that the job runs long enough to try further Slurm commands.

Put the source code into a file ring.c, select the Intel compiler

source compiler-select intel

and compile the source code:

icc -o ring ring.c -I${I_MPI_ROOT}/intel64/include -L${I_MPI_ROOT}/intel64/lib/release -lmpi


The compiler-select script can be used to select from different versions of GCC and the Intel compiler. Run it without any arguments to get a list of all possible choices. Additionally, you can add

alias compiler-select='source compiler-select'

to your .bashrc to run compiler-select without a prefixed source.

In order to submit the ring program to the cluster, you need to provide a batch script:

cat <<-EOF >
#!/bin/sh
#SBATCH -p nanofer --time=5
srun ./ring
EOF

which can then be passed to the sbatch command:

sbatch -N4 --ntasks-per-node=12 -o "ring.out.%j"

The sbatch command submits a new job to the Slurm scheduler and requests 4 nodes for it (-N4). Each node will run 12 MPI processes (--ntasks-per-node=12). The output will be written to a file named ring.out. followed by the job ID (%j is replaced by the job ID).

The batch file must start with a shebang line (#!/bin/sh). The following lines starting with #SBATCH are optional configuration values for the sbatch command. In this case, the job is limited to a runtime of 5 minutes on the partition named nanofer (the name is taken from the sinfo output). Your actual program is then started with srun (srun ./ring).

If you prefer to run commands interactively, you can allocate resources with salloc. You will be dropped into an interactive command line after your requested nodes (e.g. -N4) have been allocated for you. Here you can invoke srun manually to distribute work on the compute nodes.


You don't need to use mpirun or mpiexec in Slurm job files. srun implicitly creates an MPI runtime environment for you.

The parameters -N and --ntasks-per-node can also be added to the batch file if you would like to hardcode the number of nodes and processes.
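As a minimal sketch of such a hardcoded batch file, the snippet below writes one that carries the node count, the tasks per node, and the output file as #SBATCH directives. The file name ring_job.sh is hypothetical and only chosen for illustration; the option values are the ones used in the sbatch call above.

```shell
# Write a batch file with all job options hardcoded.
# The file name ring_job.sh is a hypothetical choice for this example.
cat > ring_job.sh <<'EOF'
#!/bin/sh
#SBATCH -p nanofer --time=5
#SBATCH -N 4 --ntasks-per-node=12
#SBATCH -o ring.out.%j
srun ./ring
EOF

# The job can then be submitted without further command line options:
# sbatch ring_job.sh
```

Options given on the sbatch command line still take precedence over the directives in the file, so the hardcoded values act as defaults.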

After your job has been submitted, you can execute squeue to list your job status:

JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
  186   nanofer ring_sba     nano  R       0:03      4 iffcluster[0601,0603-0605]

The ST column shows the current state of each job. The most important state codes are:

Code  Description
CA    Cancelled (by user or administrator)
CD    Completed
CG    Completing (some nodes are still running)
F     Failed
PD    Pending (waiting for free resources)
R     Running
TO    Timeout
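The state codes can also be translated in a script. The following sketch maps the codes listed above to their descriptions; the function name describe_state is hypothetical, and the commented usage assumes the squeue format specifiers %i (job ID) and %t (compact state).

```shell
# Translate a compact Slurm job state code into a readable description.
describe_state() {
    case "$1" in
        CA) echo "Cancelled (by user or administrator)" ;;
        CD) echo "Completed" ;;
        CG) echo "Completing (some nodes are still running)" ;;
        F)  echo "Failed" ;;
        PD) echo "Pending (waiting for free resources)" ;;
        R)  echo "Running" ;;
        TO) echo "Timeout" ;;
        *)  echo "Unknown state code: $1" ;;
    esac
}

# Usage on the login node: list your own jobs with a readable state.
# squeue -h -u "$USER" -o "%i %t" | while read -r id st; do
#     echo "$id: $(describe_state "$st")"
# done
```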

More information about Slurm commands can be found on the official website.

Migrating from Torque to Slurm

Slurm has compatibility wrappers for frequently used Torque commands like qdel, qhold, qrls, qstat and qsub, and it recognizes most #PBS directives in batch scripts. However, the wrappers only support basic functionality, and PBS directives do not cover more advanced Slurm features such as setting node lists. Therefore, it is advisable to translate PBS batch files into Slurm batch files and to get used to the Slurm commands. Many Torque commands and directives have equivalent Slurm counterparts, which are compared in this section.

The information in this section is taken from the official Slurm comparison sheet.

User Commands

Description           Torque                Slurm
Job submission        qsub [script_file]    sbatch [script_file]
Job deletion          qdel [job_id]         scancel [job_id]
Job status (by job)   qstat [job_id]        squeue -j [job_id]
Job status (by user)  qstat -u [user_name]  squeue -u [user_name]
Job hold              qhold [job_id]        scontrol hold [job_id]
Job release           qrls [job_id]         scontrol release [job_id]
Queue list            qstat -Q              squeue
Node list             pbsnodes -l           sinfo -N OR scontrol show nodes
Cluster status        qstat -a              sinfo
GUI                   xpbsmon               sview
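The Slurm commands from the table can be chained in a script once the job ID is known. The sketch below extracts the ID from the "Submitted batch job <id>" message that sbatch prints; the function name job_id_from_sbatch and the batch file name ring_job.sh are hypothetical.

```shell
# Extract the job ID from sbatch's "Submitted batch job <id>" message on stdin.
job_id_from_sbatch() {
    awk '/^Submitted batch job/ { print $4 }'
}

# Usage on the login node (ring_job.sh is a hypothetical batch file):
# job_id=$(sbatch ring_job.sh | job_id_from_sbatch)
# scontrol hold "$job_id"      # Torque: qhold
# scontrol release "$job_id"   # Torque: qrls
# squeue -j "$job_id"          # Torque: qstat
# scancel "$job_id"            # Torque: qdel
```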

Batch scripts

Description           Torque                                            Slurm
Script directive      #PBS                                              #SBATCH
Queue                 -q [queue]                                        -p [queue]
Node Count            -l nodes=[count]                                  -N [min[-max]]
CPU Count             -l ppn=[count] OR -l mppwidth=[PE_count]          -n [count]
Wall Clock Limit      -l walltime=[hh:mm:ss]                            -t [min] OR -t [days-hh:mm:ss]
Standard Output File  -o [file_name]                                    -o [file_name]
Standard Error File   -e [file_name]                                    -e [file_name]
Combine stdout/err    -j oe (both to stdout) OR -j eo (both to stderr)  (use -o without -e)
Copy Environment      -V                                                --export=[ALL / NONE / variables]
Event Notification    -m abe                                            --mail-type=[events]
Email Address         -M [address]                                      --mail-user=[address]
Job Name              -N [name]                                         --job-name=[name]
Job Restart           -r [y/n]                                          --requeue OR --no-requeue
Working Directory     (N/A)                                             --workdir=[dir_name]
Resource Sharing      -l naccesspolicy=singlejob                        --exclusive OR --shared
Memory Size           -l mem=[MB]                                       --mem=[mem][M/G/T] OR --mem-per-cpu=[mem][M/G/T]
Account to charge     -W group_list=[account]                           --account=[account]
Tasks Per Node        -l mppnppn [PEs_per_node]                         --ntasks-per-node=[count]
CPUs Per Task         (N/A)                                             --cpus-per-task=[count]
Job Dependency        -d [job_id]                                       --dependency=[state:job_id]
Job Project           (N/A)                                             --wckey=[name]
Job host preference   (N/A)                                             --nodelist=[nodes] AND/OR --exclude=[nodes]
Quality Of Service    -l qos=[name]                                     --qos=[name]
Job Arrays            -t [array_spec]                                   --array=[array_spec]
Generic Resources     -l other=[resource_spec]                          --gres=[resource_spec]
Licenses              (N/A)                                             --licenses=[license_spec]
Begin Time            -A "YYYY-MM-DD HH:MM:SS"                          --begin=YYYY-MM-DD[THH:MM[:SS]]
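A first pass of such a translation can be automated. The following is only a rough sketch that rewrites four of the directives from the table (queue, job name, wall clock limit, and email address) and leaves everything else untouched; the function name pbs_to_sbatch is hypothetical, and a sed with extended regular expression support (-E) is assumed. The result should still be reviewed by hand.

```shell
# Translate a few common #PBS directives into their #SBATCH counterparts.
# Reads a Torque batch script on stdin and writes the result to stdout.
# Only the four directives below are handled; all other lines pass through.
pbs_to_sbatch() {
    sed -E \
        -e 's/^#PBS[[:space:]]+-q[[:space:]]+(.+)$/#SBATCH -p \1/' \
        -e 's/^#PBS[[:space:]]+-N[[:space:]]+(.+)$/#SBATCH --job-name=\1/' \
        -e 's/^#PBS[[:space:]]+-l[[:space:]]+walltime=(.+)$/#SBATCH -t \1/' \
        -e 's/^#PBS[[:space:]]+-M[[:space:]]+(.+)$/#SBATCH --mail-user=\1/'
}

# Usage (both file names are hypothetical):
# pbs_to_sbatch < job.pbs > job.slurm
```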

Environment variables

Description Torque Slurm