19th Workshop on high performance computing in meteorology

Autosubmit: An end-to-end workflow manager

Speaker

Wilmer Uruchi (Barcelona Supercomputing Center)

Description

Running weather and climate simulation models requires orchestrating a series of jobs (steps) that depend on each other. In High-Performance Computing, resources are usually scarce: as demand rises, users' simulations compete with each other in the scheduling system. An HPC center usually hosts clusters of different capacities and technologies, each with its own scheduling system. This diversity presents the opportunity to minimize waiting times by executing jobs on different platforms while still keeping track of their execution status. However, as we try to maximize the usage of the available clusters, the complexity of the workflow increases. Autosubmit, a workflow management system, allows users to configure their simulation experiments to run on multiple platforms. After the user configures the experiment and the configuration is validated, Autosubmit takes over the execution and sends the available jobs to their respective platforms. It keeps track of the jobs' status by querying the scheduling systems of the platforms and stores relevant information for later reporting.
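The dependency-driven submission described above can be sketched in a few lines of Python. This is only an illustration, not Autosubmit's actual API or configuration format: the job names, the platform names (`cluster_a`, `cluster_b`), and the `submission_waves` helper are all hypothetical, and every job is assumed to finish successfully instead of being polled through a real scheduler.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: job name -> (target platform, jobs it depends on).
WORKFLOW = {
    "INI": ("cluster_a", set()),
    "SIM": ("cluster_a", {"INI"}),
    "POST": ("cluster_b", {"SIM"}),
    "CLEAN": ("cluster_b", {"POST"}),
    "TRANSFER": ("cluster_b", {"POST"}),
}


def submission_waves(workflow):
    """Group jobs into waves; jobs in one wave have all dependencies met."""
    sorter = TopologicalSorter({job: deps for job, (_, deps) in workflow.items()})
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        # Each ready job would be sent to its own platform's scheduler.
        waves.append([(job, workflow[job][0]) for job in ready])
        # Assumption: every job completes; a real manager would query
        # the scheduler for the job status before marking it done.
        for job in ready:
            sorter.done(job)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(submission_waves(WORKFLOW), start=1):
        print(number, wave)
```

Jobs that become ready at the same time, such as `CLEAN` and `TRANSFER` in this sketch, land in the same wave and could be submitted concurrently.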

Some scheduling systems offer features that Autosubmit can exploit to achieve lower waiting times. For example, Autosubmit can wrap a self-contained group of mutually dependent jobs so that it appears to the scheduler as a single large job. This wrapped job then waits in the queue only once before executing its sequence of jobs. Autosubmit is complemented by a web-based graphical user interface built with JavaScript and ReactJS, which we call Autosubmit GUI. This web application presents dynamic, up-to-date representations of the experiment, along with important information about each job. It also displays the data collected by Autosubmit so that the user can compare past executions of the experiment at job-level detail. In this way, Autosubmit manages the workflow from the HPC level up to the front-end graphical representation.
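To illustrate why wrapping a chain of dependent jobs can reduce waiting time, the following back-of-the-envelope model compares the two submission strategies. It is a deliberately simplified sketch: the `Job` class and both helper functions are hypothetical, and the queue wait is assumed to be the same constant for every submission, which real schedulers do not guarantee (a larger wrapped job may well wait longer than a small one).

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    runtime_min: float  # execution time in minutes


def time_unwrapped(jobs, queue_wait_min):
    """Sequential chain: each job queues separately after its predecessor."""
    return sum(queue_wait_min + job.runtime_min for job in jobs)


def time_wrapped(jobs, queue_wait_min):
    """Wrapped chain: one queue wait, then the jobs run back to back."""
    return queue_wait_min + sum(job.runtime_min for job in jobs)


if __name__ == "__main__":
    chain = [Job("SIM_1", 10), Job("SIM_2", 10), Job("SIM_3", 10)]
    print("unwrapped:", time_unwrapped(chain, queue_wait_min=30))
    print("wrapped:  ", time_wrapped(chain, queue_wait_min=30))
```

Under this constant-wait assumption, the saving is the queue wait multiplied by the number of jobs minus one.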

Primary authors

Wilmer Uruchi (Barcelona Supercomputing Center)
Miguel Castrillo (BSC-CNS)
Mr Daniel Beltran Mora (Barcelona Supercomputing Center)
