Distributed Systems
[Closed]
Work description
- Design of an energy control system for GPUs running deep learning training workloads, based on a feedback control loop;
- Prototype implementation and optimization of that design;
- Experimental evaluation of the prototype with different deep learning models and hardware (e.g., processing units, storage devices).
The tasks in this work plan require applying and developing Software Engineering concepts and techniques that are usually introduced in curricular units of the Mestrado Integrado em Engenharia Informática or the Mestrado em Engenharia Informática.
Academic Qualifications
- BSc Degree in Informatics Engineering Sciences.
Minimum profile required
- Solid knowledge of energy monitoring and energy control tools and techniques (e.g., Intel RAPL, PowerJoular, EnergAt, NVML, DVFS);
- Knowledge of deep learning frameworks, models, and datasets (e.g., PyTorch, ResNet18, AlexNet, CIFAR-10), as well as heterogeneous workloads (e.g., cloud-based workloads, supercomputing workloads);
- Solid knowledge of operating systems;
- Solid knowledge of distributed systems.
Preference factors
- Experience in the design and development of energy monitoring frameworks;
- Experience in the design and development of feedback loop-based energy control systems for GPUs;
- Experience with the C++ programming language.
Application Period
From 29 Feb 2024 to 13 Mar 2024
Centre
High-Assurance Software