Rockport

Rockport vs InfiniBand Cluster

Half of COSMA7 now has a Rockport Ethernet network fabric, rather than InfiniBand. This means you can compare how your code performs on traditional NVIDIA InfiniBand against exactly the same cluster hardware connected with Rockport.

Each COSMA7 node has 28 cores (Intel) and 512GB RAM. Both the cosma7 and cosma7-rp (Rockport) partitions have 224 nodes.

The Rockport fabric is based on a 6D torus topology, rather than the 2:1 blocking fat-tree fabric used by the rest of COSMA7. It should offer consistently low latency even when the network is heavily congested, so workloads that are latency dependent, or that suffer under congestion, should benefit. Each node has 100 Gbit/s connectivity (also the case for the InfiniBand nodes).

Usage

To use the Rockport nodes, submit jobs to the cosma7-rp partition. For example, within your batch script you would use:

#SBATCH -p cosma7-rp

You are also advised to load the rockport-settings environment module within your SLURM batch script:

module load rockport-settings

This sets some environment variables that allow the Rockport fabric to be used most efficiently. To see them, run module show rockport-settings. If using Intel MPI, you can simply use mpirun. If using OpenMPI, use:

mpirun $RP_OPENMPI_ARGS …
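
Putting these pieces together, a minimal OpenMPI job script might look like the sketch below. The job name, task count, time limit and the ./my_app executable are placeholders rather than COSMA defaults, and any account or project options your allocation requires are left out. The script is written to a file so that it can be inspected before submission.

```shell
# Write a minimal Rockport job script (OpenMPI variant) to rockport_job.sh.
# Job name, task count, time limit and ./my_app are hypothetical placeholders.
cat > rockport_job.sh <<'EOF'
#!/bin/bash -l
#SBATCH -p cosma7-rp          # Rockport partition
#SBATCH -J rockport-test      # placeholder job name
#SBATCH --ntasks=56           # placeholder: 2 nodes x 28 cores
#SBATCH -t 00:30:00           # placeholder time limit

module load rockport-settings
module load openmpi/4.1.1

# RP_OPENMPI_ARGS is expected to be set by the rockport-settings module.
mpirun $RP_OPENMPI_ARGS ./my_app
EOF

# Submit with: sbatch rockport_job.sh
```

With Intel MPI, the module line would change to the Intel one listed below and the mpirun line would drop $RP_OPENMPI_ARGS.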

The two recommended MPI libraries to use are (as of June 2022):

module load intel_comp/2022.1.2 compiler mpi 
module load openmpi/4.1.1

To enable multipath on a per-job basis when using UCX, add the following to the mpirun command line. For OpenMPI:

-x UCX_IB_TRAFFIC_CLASS=160

and for Intel MPI (after 2018):

-genv UCX_IB_TRAFFIC_CLASS=160

This allows packets to take multiple routes between pairs of nodes, increasing the direct bandwidth between them.
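
To show where the multipath option sits on the command line, here is a small sketch. The multipath_flag helper and the ./my_app executable are hypothetical names used only for illustration; the flags themselves are the ones given above. The last line is a dry run that prints the command instead of launching it.

```shell
# Print the per-job multipath option for the MPI library in use.
# multipath_flag is a hypothetical helper, not part of any module.
multipath_flag() {
    case "$1" in
        openmpi) echo "-x UCX_IB_TRAFFIC_CLASS=160" ;;
        intel)   echo "-genv UCX_IB_TRAFFIC_CLASS=160" ;;
        *)       echo "unknown MPI library: $1" >&2; return 1 ;;
    esac
}

# Dry run: show the OpenMPI command line rather than executing it.
echo mpirun '$RP_OPENMPI_ARGS' $(multipath_flag openmpi) ./my_app
```

In a real job script you would drop the echo and the single quotes so that mpirun is actually run with the expanded arguments.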

Other settings

Rockport recommend some other settings for RDMA transports:

--fwd-mpirun-port
--mca oob_tcp_listen_mode listen_thread
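
For OpenMPI, parameters passed with --mca can alternatively be set through environment variables named OMPI_MCA_<parameter>. This is a general OpenMPI convention rather than anything specific to Rockport. A minimal sketch, assuming you would rather set the parameter once in the job script than on every mpirun line (./my_app is a placeholder):

```shell
# Equivalent to passing "--mca oob_tcp_listen_mode listen_thread" to mpirun.
export OMPI_MCA_oob_tcp_listen_mode=listen_thread

# --fwd-mpirun-port is a plain mpirun option, so it stays on the command line:
# mpirun --fwd-mpirun-port $RP_OPENMPI_ARGS ./my_app
```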

Storage

The Rockport partition does not currently have dedicated (fabric-native) storage. Instead, traffic to the /cosma7, /cosma6 etc. storage systems is routed through 4 dedicated nodes, which results in a slight reduction in performance. Therefore, when benchmarking codes, please don’t include any time spent reading from or writing to storage.

Not all storage systems are mounted on the Rockport partition – please check before use. For example, /cosma6 is not currently mounted.
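
One way to check is to test each path from within a job running on the Rockport partition. The sketch below assumes the storage systems appear as ordinary mount points and that the mountpoint utility (part of util-linux) is available on the nodes; check_mounts is a hypothetical helper name.

```shell
# Report whether each given path is a mount point on the current node.
# Returns non-zero if any of the paths is not mounted.
check_mounts() {
    status=0
    for path in "$@"; do
        if mountpoint -q "$path" 2>/dev/null; then
            echo "$path: mounted"
        else
            echo "$path: NOT mounted"
            status=1
        fi
    done
    return "$status"
}

# Example: run inside a cosma7-rp job before relying on these paths.
check_mounts /cosma7 /cosma6 || echo "some storage is unavailable on this node"
```

If a storage system is exposed as a subdirectory or symbolic link rather than a mount point, a plain directory test such as [ -d /cosma7 ] may be the more appropriate check.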