Experiments in implementing the BigHPC Virtual Manager
Date: September 22, 2022 | 3.00 p.m. (GMT+1)
Speakers: Amit Ruhela, John Cazes, and Stephen Harrell (TACC & UT Austin)
Moderator: Miguel Viana, LIP
The Virtual Manager (VM) is a component of the BigHPC implementation that aims to stage and execute application workloads optimally on one of a variety of HPC systems. It consists mainly of two subcomponents: the VM Scheduler and the VM Repository.
The Virtual Manager Scheduler provides an interface to submit and monitor application workloads, and it coordinates the allocation of computing resources on the HPC systems. It executes workloads optimally by matching the resource requirements and QoS specified by the user against the available HPC clusters, partitions, and QoS reported by the BigHPC Monitoring and Storage Manager components.
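As a rough illustration of the matching step described above, the following Python sketch picks a partition for a workload. It is not the BigHPC implementation: the `Partition` and `Workload` types, their fields, and the best-fit rule (smallest surplus of free nodes) are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Partition:
    """Hypothetical view of one partition as a monitoring component might report it."""
    cluster: str
    name: str
    free_nodes: int
    qos_levels: frozenset


@dataclass(frozen=True)
class Workload:
    """Hypothetical workload request: node count and requested QoS."""
    nodes: int
    qos: str


def pick_partition(workload: Workload,
                   partitions: Iterable[Partition]) -> Optional[Partition]:
    """Return a partition that satisfies the request, or None if none does."""
    # Keep only partitions with enough free nodes that offer the requested QoS.
    candidates = [
        p for p in partitions
        if p.free_nodes >= workload.nodes and workload.qos in p.qos_levels
    ]
    if not candidates:
        return None
    # Assumed policy: best fit, i.e. the smallest surplus of free nodes.
    return min(candidates, key=lambda p: p.free_nodes - workload.nodes)
```

A real scheduler would likely weigh more factors (queue wait time, data locality, storage QoS); the sketch only shows the filter-then-rank shape of the decision.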
Additionally, the Virtual Manager Repository provides a platform to build and store, as container images, the software services and applications that support BigHPC workloads. It then serves those uploaded images programmatically when a workload request is submitted to the Virtual Manager Scheduler for execution.
In this talk, we first present a few possible approaches to designing the Virtual Manager, then discuss the pros and cons of each, and finally describe the approach that we determined to be most feasible and adopted in the BigHPC implementation.