The aim of this online course is to introduce participants to ECMWF's Atos supercomputing system and enable them to efficiently exploit its full potential. The course will cover a full description of the system, from fundamental architecture and software modules to advanced optimisation techniques.
The target audience is users from ECMWF Member States who already have a good level of HPC experience, and those planning to use the ECMWF systems in the near future. Users who plan to use the system in the near future but do not yet have a user ID need to acquire one before applying through their Member State computing representative.
The course will be delivered online, as a mixture of presentations and practical sessions via Microsoft Teams.
Registration is possible for individual sessions. Some sessions have pre-requisites which are detailed below.
All lectures will be given in English.
Session dates
Monday 13 May - morning session
Introduction to parallel computing
09:00 to 10:00 BST (08:00 to 09:00 UTC) - Lucian Anton, ECMWF
Introduction to parallel computing.
Atos environment
10:00 to 11:00 BST (09:00 to 10:00 UTC) - Xavier Abellan, ECMWF
An introduction to the Atos environment at ECMWF, including login, modules, the software stack, ecinteractive and file systems.
Monday 13 May - afternoon session
Virtual Computer tour
13:00 to 13:30 BST (12:00 to 12:30 UTC) - Roberto Cuccu, ECMWF
Introduction to Atos HPC architecture
13:30 to 14:15 BST (12:30 to 13:15 UTC) - Martyn Foster, Atos
Storage Architecture and Lustre
14:15 to 15:00 BST (13:15 to 14:00 UTC) - Martyn Foster, Atos
An introduction to Atos HPC architecture, including:
- Macro architecture
- AMD Rome
- InfiniBand
Storage Architecture and Lustre:
- Lustre Architecture
- System design
- Filesystems
An overview of the supercomputer architecture at ECMWF, including descriptions of the principal components and how they connect to provide a resilient, high-performance supercomputer.
This is an informative presentation with no prerequisites.
Tuesday 14 May - morning session
SLURM and job scheduling
09:00 to 10:00 BST (08:00 to 09:00 UTC) - Martyn Foster, Atos
Using Lustre effectively
10:00 to 11:00 BST (09:00 to 10:00 UTC) - Martyn Foster, Atos
This session explains:
- How to use SLURM, the resource management system deployed on the HPC systems, to run and manage parallel and serial jobs.
- How to make best use of the parallel filesystems within applications, and which tools can be used to examine and modify IO behaviour.
This is an informative session with some practical activity.
Attendance at the previous session is advised.
Tuesday 14 May - afternoon session
Compiling and running a parallel program on the HPC systems
13:00 to 15:00 BST (12:00 to 14:00 UTC) - Martyn Foster, Atos
Presentations including:
- Compilers - Intel
- MPI - Open MPI
- MPI - Intel MPI
This module covers in depth the use of the predominant programming environments on the HPC systems: the Intel compilers and the Message Passing Interface (MPI) libraries.
This is an advanced informative session with some classroom activity.
It is advised that attendees have used the HPC systems before and are conversant in C or Fortran.
Wednesday 15 May - morning session
In-depth topics in HPC
09:00 to 11:00 BST (08:00 to 10:00 UTC) - Martyn Foster, Atos
In-depth topics in HPC, including:
- Hybrid applications
- Task affinity
- Hugepages
This module covers in depth some techniques for ensuring good performance on the Atos HPC systems.
This is an informative session with some practical activities.
It is advised that attendees have used the HPC systems before and are conversant in C or Fortran.
Attendees are encouraged to have their own applications available for testing the techniques on.
Wednesday 15 May - afternoon session
Profiling I - CPU
13:00 to 15:00 BST (12:00 to 14:00 UTC) - Martyn Foster, Atos
This session describes techniques and tools available for evaluating compute performance on the HPC systems. It is a mixture of presentations and hands-on sessions.
It is advised that attendees have used the HPC systems before and are conversant in C or Fortran.
Attendees are encouraged to have their own applications available for testing the techniques on.
Thursday 16 May - morning session
Profiling II - IO and MPI
09:00 to 11:00 BST (08:00 to 10:00 UTC) - Alex Woods, Atos
This session describes techniques and tools available for evaluating the MPI- and IO-related performance and characteristics of applications on the HPC systems. It is a mixture of presentations and hands-on sessions.
It is advised that attendees have used the HPC systems before and are conversant in C or Fortran.
Attendees are encouraged to have their own applications available for testing the techniques on.
Thursday 16 May - afternoon session
Debugging
14:00 to 15:00 BST (13:00 to 14:00 UTC) - Alex Woods, Atos
This session describes techniques and tools available for fault finding (debugging) in parallel applications on the HPC systems. It will comprise a mixture of presentations and hands-on material.
It is advised that attendees have used the HPC systems before and are conversant in C or Fortran.