Technical Information

This topic lists relevant technical information about the cluster hardware.

Nodes

Currently the cluster contains the following GPU nodes:

  • 4 nodes with:
    • GPU: 8 x NVidia RTX 5060 Ti (4608 CUDA cores, 16 GB RAM, compute capability 12.0)
    • CPU: 2 x 14 core Xeon E5-2680 V4 @ 2.4GHz
    • RAM: 256GB DDR4 @ 2400MHz
  • 28 nodes with:
    • GPU: 8 x NVidia GTX 1080 Ti (3584 CUDA cores, 11 GB RAM, compute capability 6.1)
    • CPU: 2 x 10 core Xeon E5-2630 V4 @ 2.2GHz
    • RAM: 256GB DDR4 @ 2133MHz
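
The two GPU types differ in compute capability (12.0 for the RTX 5060 Ti, 6.1 for the GTX 1080 Ti), so binaries or JIT-compiled kernels built for one architecture may not run on the other. A minimal sketch, assuming PyTorch with CUDA support is installed in the job environment, for checking which GPUs a job has been given:

    # Minimal sketch, assuming PyTorch with CUDA support is installed in the job environment.
    import torch

    if torch.cuda.is_available():
        for i in range(torch.cuda.device_count()):
            name = torch.cuda.get_device_name(i)
            major, minor = torch.cuda.get_device_capability(i)
            print(f"GPU {i}: {name}, compute capability {major}.{minor}")
    else:
        print("No CUDA device visible to this job")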

For Jupyter notebooks that only require CPUs there are three nodes:

  • 3 nodes with:
    • CPU: 2 x 22 core Xeon E5-2699 V4 @ 2.2GHz
    • RAM: 512GB @ 2400MHz

Only one GPU node and one CPU node are always on. The remaining nodes are turned on when needed for running jobs and are shut down again automatically after being idle for more than one hour.

The GPU nodes are distributed over four racks and are powered on in a balanced fashion across the racks. The GPU nodes with NVidia RTX 5060 Ti GPUs are used first, and their GPUs always run at a 180W power limit. The NVidia GTX 1080 Ti GPUs run at a 250W power limit until more than four nodes per rack are running. Beyond that, the power limit for the nodes in a rack is reduced gradually, down to 125W per GPU when all seven nodes are running. This is required to stay within the power budget of our server racks.
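
The resulting per-GPU power limit therefore depends on how many GTX 1080 Ti nodes are running in a rack. A minimal sketch of the idea in Python; the exact shape of the ramp between 250W and 125W is not specified here, so the linear interpolation below is an assumption for illustration only:

    # Illustrative sketch of the power limit policy described above.
    # Assumption: the reduction from 250W to 125W is linear in the number of
    # running GTX 1080 Ti nodes in a rack; the actual ramp may differ.

    def gtx_1080_ti_power_limit(nodes_running: int) -> float:
        """Per-GPU power limit in watts for a rack with this many running nodes."""
        if nodes_running <= 4:
            return 250.0  # full power limit up to four running nodes
        if nodes_running >= 7:
            return 125.0  # minimum limit with all seven nodes running
        # linear ramp between 4 nodes (250W) and 7 nodes (125W) -- an assumption
        return 250.0 - (nodes_running - 4) * (250.0 - 125.0) / 3

    for n in range(1, 8):
        per_gpu = gtx_1080_ti_power_limit(n)
        print(f"{n} nodes running: {per_gpu:.0f}W per GPU, "
              f"~{n * 8 * per_gpu / 1000:.1f} kW total GPU power in the rack")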

Local Scratch Space

All nodes have 350GB of local scratch space that is shared by the jobs running on the node.
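
Jobs that write large intermediate files should check that enough local scratch space is free before starting. A minimal sketch, assuming the scratch space is mounted at /scratch (a hypothetical path; check the actual mount point on the nodes):

    # Minimal sketch; /scratch is an assumed mount point for the local scratch space.
    import shutil

    usage = shutil.disk_usage("/scratch")
    free_gb = usage.free / 1e9
    print(f"Local scratch: {free_gb:.0f} GB free of {usage.total / 1e9:.0f} GB total")

    if free_gb < 50:  # example threshold for a job that needs 50 GB of scratch
        raise SystemExit("Not enough local scratch space available for this job")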

Storage

A dedicated Ceph cluster provides all the storage using CephFS:

  • 5 nodes with:
    • CPU: 2 x 6 core Xeon Scalable 3204 @ 1.9GHz
    • RAM: 96GB @ 2666MHz

The following storage devices are used for providing storage:

  • 2 x 12.8 TB Samsung PM1735, PCI-e x8:
    • 8000 MB/s read, 3800 MB/s write
    • 1500k IOPS read, 250k IOPS write
    • Redundancy: 3, used for /home and /cluster
  • 3 x 12.8 TB KIOXIA CD8-V, PCI-e x4:
    • 6600 MB/s read, 6000 MB/s write
    • 1050k IOPS read, 380k IOPS write
    • Redundancy: 3, used for /home and /cluster
  • 5 x 7.68 TB Samsung PM893, SATA:
    • 550 MB/s read, 520 MB/s write
    • 98k IOPS read, 30k IOPS write
    • Redundancy: 2, used for /work

Triple redundancy is used for /cluster and /home to improve resilience and read speed, resulting in 10.6 TB of effective storage capacity. Performance is currently limited by the network and allows concurrent reads at 6 GB/s from the nodes and writes at 3 GB/s.

Dual redundancy is used for /work, which provides acceptable resilience and 16 TB of effective storage capacity. Performance is limited by the SATA bus and allows concurrent reads at 2.5 GB/s from the nodes and writes at 1.3 GB/s.

Login Nodes

Two login nodes are available for students to prepare and start jobs:

  • student-cluster1.inf.ethz.ch and student-cluster2.inf.ethz.ch with:
    • 2 x 16 core E5-2697A v4 @ 2.6GHz
    • 512 GB RAM @ 2400MHz
