Rockport
Rockport vs InfiniBand Cluster
Half of COSMA7 now has a Rockport Ethernet network fabric rather than InfiniBand. This means you can test your code on traditional NVIDIA InfiniBand and on exactly the same cluster hardware connected with Rockport.
Each COSMA7 node has 28 Intel cores and 512 GB of RAM. Both the cosma7 and cosma7-rp (Rockport) partitions have 224 nodes.
The Rockport fabric is based on a 6D torus topology, rather than the 2:1 blocking fat-tree fabric used by the rest of COSMA7. It should offer consistently low latency even when the network is heavily congested, so workloads that are latency-dependent, or that suffer under congestion, should benefit. Each node has 100 Gbit/s connectivity, as do the InfiniBand nodes.
Usage
To use the Rockport nodes, submit jobs to the cosma7-rp partition. For example, within your batch script you would use:
#SBATCH -p cosma7-rp
You are also advised to load the rockport-settings environment module within your SLURM batch script:
module load rockport-settings
This sets some environment variables that allow the Rockport fabric to be used most efficiently. To see them, run module show rockport-settings. If using Intel MPI, you can simply use mpirun. If using OpenMPI, use:
mpirun $RP_OPENMPI_ARGS …
The two recommended MPI libraries to use are (as of June 2022):
module load intel_comp/2022.1.2 compiler mpi
module load openmpi/4.1.1
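The steps above can be combined into a single job script. The following is a minimal sketch for the OpenMPI case, not a definitive template: the file name rockport_job.sh, the node count, the time limit, the job name and the ./my_app executable are all placeholders that do not come from this page, and any other options your project needs (such as an account) are omitted.

```shell
# Write a job script to a file. Values marked "placeholder" are assumptions.
cat > rockport_job.sh <<'EOF'
#!/bin/bash
# Rockport partition
#SBATCH -p cosma7-rp
# placeholder: number of nodes
#SBATCH -N 2
# one MPI rank per core (28 cores per node)
#SBATCH --ntasks-per-node=28
# placeholder: time limit and job name
#SBATCH -t 00:10:00
#SBATCH -J rp-test

module load openmpi/4.1.1
module load rockport-settings

# ./my_app is a placeholder executable
mpirun $RP_OPENMPI_ARGS ./my_app
EOF

# Submit it from a login node with:
#   sbatch rockport_job.sh
```

For Intel MPI, the same sketch would load intel_comp/2022.1.2 compiler mpi instead and call mpirun without $RP_OPENMPI_ARGS.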
To enable multipath on a per-job basis when using UCX, add the following to the mpirun command line. For OpenMPI:
-x UCX_IB_TRAFFIC_CLASS=160
and for Intel MPI (after 2018):
-genv UCX_IB_TRAFFIC_CLASS=160
This allows packets to take multiple routes between a pair of nodes, increasing the direct bandwidth between them.
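Because the flag syntax differs between the two MPI libraries, a job script that supports both may want to select it in one place. The sketch below assumes a hypothetical helper named multipath_flag and a placeholder executable ./my_app; neither is provided by the cluster, and the echo only prints the command line rather than running it.

```shell
# Hypothetical helper: print the multipath flag for the given MPI flavour.
multipath_flag() {
    case "$1" in
        openmpi)  printf '%s\n' '-x UCX_IB_TRAFFIC_CLASS=160' ;;
        intelmpi) printf '%s\n' '-genv UCX_IB_TRAFFIC_CLASS=160' ;;
        *)        echo "unknown MPI flavour: $1" >&2; return 1 ;;
    esac
}

# Show the command line that would be used with OpenMPI.
# ./my_app is a placeholder executable.
echo "mpirun $(multipath_flag openmpi) ./my_app"
```

With OpenMPI, $RP_OPENMPI_ARGS would still be passed alongside this flag, as described earlier.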
Other settings
Rockport recommend some other settings for RDMA transports:
--fwd-mpirun-port
--mca oob_tcp_listen_mode listen_thread
Storage
The Rockport partition does not currently have dedicated (fabric-native) storage. Instead, the /cosma7, /cosma6, etc. storage systems are routed through 4 dedicated nodes, which results in a slight reduction in performance. Therefore, when benchmarking codes, please do not include any time spent reading from or writing to storage.
Not all storage systems are mounted on the Rockport partition, so please check before use. For example, /cosma6 is not currently mounted.
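One rough way to check from within a job is to test whether each path exists and contains anything. This is only a sketch: storage_available is a hypothetical helper, and "directory exists and is non-empty" is an assumed proxy for "mounted", which may not hold for every filesystem configuration.

```shell
# Hypothetical helper: succeed if the path is a directory with at least one entry.
storage_available() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Report on the storage systems named on this page.
for fs in /cosma7 /cosma6; do
    if storage_available "$fs"; then
        echo "$fs: available"
    else
        echo "$fs: not available on this node"
    fi
done
```

Running this at the start of a job on cosma7-rp shows which paths are usable before the application tries to read from or write to them.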