Storage Systems
Work description
Design of a solution to improve the performance and balance the load of checkpointing requests issued by AI applications deployed on distributed HPC infrastructures. Development and experimental evaluation of the prototype. The tasks described in this work plan require applying and developing Software Engineering concepts and techniques that are usually introduced in curricular units of the Integrated MSc in Informatics Engineering or the MSc in Informatics Engineering.
Academic Qualifications
BSc Degree in Computer Science or Informatics Engineering.
Minimum profile required
- Solid knowledge of storage systems.
- Knowledge of Deep Learning and/or Large Language Model workflows.
- Experience in using PyTorch and/or TensorFlow.
- Knowledge of and experience with checkpointing in AI applications.
Preference factors
- Experience with the C/C++ programming languages.
Application Period
From 03 Apr 2025 to 16 Apr 2025
Centre
High-Assurance Software