Running Jobs
To run a command as a job you can either use srun to quickly run it interactively or sbatch to spool it for asynchronous execution.
In either case you always need to supply the following two parameters for the job:
- The course or project for which you run the job. You can see the course or project tags that you can use when you log in via ssh.
- The maximum runtime you expect the command to need. This can be omitted, in which case a default of 60 minutes is used.

One GPU is automatically added to the job for courses and projects that use GPUs.
Using srun
For a simple, short-running command you can use srun with the -A and -t switches. For instance, to get the properties of the GPU at your disposal (and spend at most ten seconds doing so) run:
srun -A {tag} -G 1 -t 00:10 -o nvidia-smi.out nvidia-smi
Running Interactive Jobs
If you need to interact with a program via terminal then you can also use srun with the --pty argument. To get an interactive bash on a node for at most 60 minutes you can run:
srun --pty -A {tag} -t 60 bash
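If your interactive work needs a GPU, the switches shown above can be combined with --pty. A minimal sketch (the explicit -G 1 may be redundant for courses and projects where a GPU is added automatically):
srun --pty -A {tag} -G 1 -t 60 bash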
Using sbatch
For sbatch you will put your commands in a script and add comment lines at the beginning that contain additional parameters for sbatch. The example from above, but using sbatch, would require a script containing this:
#!/bin/bash
#SBATCH --time=00:10
#SBATCH --account={tag}
#SBATCH --output=nvidia-smi.out
nvidia-smi
To send the script to the cluster for execution run (assuming you saved the script as job.sh):
sbatch job.sh
GPU Selection
For courses and projects that allow multiple GPUs you can request more than one GPU using the --gpus switch, both with srun and sbatch. For example, to request two GPUs with srun:
srun -A {tag} --gpus 2 -t 00:10 nvidia-smi
or in a batch script:
#SBATCH --gpus=2
You can also request a specific GPU model for your job using --gpus by prefixing the GPU count with the model name. For example:
srun -A {tag} --gpus 5060ti:1 -t 00:10 nvidia-smi
Currently the following GPUs are available:
- 1080ti (NVIDIA GeForce GTX 1080 Ti, 11 GB)
- 5060ti (NVIDIA GeForce RTX 5060 Ti, 16 GB)
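Putting this together, a batch script header requesting two of the 16 GB cards might look like the following sketch (the two-GPU request assumes your course or project allows multiple GPUs):
#!/bin/bash
#SBATCH --account={tag}
#SBATCH --time=00:10
#SBATCH --gpus=5060ti:2
nvidia-smi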
Using Modules
To use the module command in a batch script your first command must be the following (note the leading dot):
. /etc/profile.d/modules.sh
For example:
#!/bin/bash
#SBATCH --time=00:10
#SBATCH --account={tag}
#SBATCH --output=nvcc.out
. /etc/profile.d/modules.sh
module add cuda/12.1
nvcc --version
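If you are unsure which modules and versions are installed, the standard listing command of the environment-modules system should work here as well, for instance in an interactive shell on a node:
. /etc/profile.d/modules.sh   # only needed in batch scripts and other non-login shells
module avail                  # list all available modules and versions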
Temporary Space and Network Scratch
Each job receives a dedicated temporary directory under /tmp ($TMPDIR). This directory is located on the local SATA SSD. Data placed there is deleted when the job ends. There are also limitations on how much space you can use.
You also have a personal network scratch directory under /work/scratch that is accessible from all nodes, including the login nodes. Old data there is automatically deleted; how soon depends on how much you put in there.
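A common pattern is to stage input data onto the fast local SSD at the start of the job and to copy results to the network scratch before the job ends. A minimal sketch, assuming your personal scratch directory is /work/scratch/$USER (the dataset path and the training command are placeholders):
#!/bin/bash
#SBATCH --account={tag}
#SBATCH --time=60
#SBATCH --output=train.out

# Stage input data onto the job-local SSD (placeholder path).
cp -r /work/scratch/$USER/my-dataset "$TMPDIR"/

# Run the actual work against the fast local copy.
cd "$TMPDIR"
my-training-command my-dataset   # placeholder for your program

# Copy results back before $TMPDIR is wiped at the end of the job.
cp -r results /work/scratch/$USER/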
Checking the Job Queue
To see if your job is still waiting in the queue run:
squeue -u $USER
If you see jobs listed then they are either executing or waiting; in case of the latter, a reason is given why the job cannot start yet.

This cluster maximizes energy efficiency and powers down nodes that are idle. If all running nodes are busy, a node may first need to be started to run your job, which can delay the start by around five minutes.
Output
Terminal output of the commands in the job goes by default to a file slurm-{job ID}.out in your home directory. The file name for the output can also be set explicitly; in the examples above the terminal output of the job will be in the file nvidia-smi.out.
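Slurm also supports filename patterns in the output option; for example, %j expands to the job ID, which keeps the output files of different runs apart:
#SBATCH --output=myjob-%j.out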
Aborting Jobs
If you already know that a running or spooled job is not going to do the right thing, cancel it using scancel with the job ID:
scancel {job ID}
This will save you GPU time.
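scancel also accepts a user filter if you want to cancel all of your jobs at once:
scancel -u $USER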
Tunneling SSH to Nodes with cluster-tunnel
The cluster-tunnel utility (a clone of euler-tunnel) can be used to connect to nodes via SSH from outside. This can for instance be used to run vscode or code-server directly on nodes.
The instructions are the same as those provided for the Euler cluster, but all occurrences of the string euler have to be replaced by cluster.