Publications

Publications by Luis Miguel Pinho

2022

Configuration of Parallel Real-Time Applications on Multi-Core Processors

Authors
Gharajeh, MS; Carvalho, T; Pinho, LM;

Publication
2022 IEEE 20TH INTERNATIONAL CONFERENCE ON INDUSTRIAL INFORMATICS (INDIN)

Abstract
Parallel programming models (e.g., OpenMP) are increasingly used to improve the performance of real-time applications in modern processors. Nevertheless, these processors have complex architectures, which makes their timing behavior very difficult to understand. The main challenge with most existing works is that they apply static timing analysis for simpler models or measurement-based analysis using traditional platforms (e.g., single core) or considering only sequential algorithms. How to provide an efficient configuration for the allocation of the parallel program in the computing units of the processor is still an open challenge. This paper studies the problem of performing timing analysis on complex multi-core platforms, pointing out a methodology to understand the applications' timing behavior, and guide the configuration of the platform. As an example, the paper uses an OpenMP-based program of the Heat benchmark on an NVIDIA Jetson AGX Xavier. The main objectives are to analyze the execution time of OpenMP tasks, specify the best configuration of OpenMP directives, identify critical tasks, and discuss the predictability of the system/application. A Linux perf-based measurement tool, which has been extended by our team, is applied to measure each task across multiple executions in terms of total CPU cycles, the number of cache accesses, and the number of cache misses at different cache levels, including L1, L2 and L3. The evaluation process is performed using the measurement of the performance metrics by our tool to study the predictability of the system/application.

2022

Heuristic-based Task-to-Thread Mapping in Multi-Core Processors

Authors
Gharajeh, MS; Royuela, S; Pinho, LM; Carvalho, T; Quinones, E;

Publication
2022 IEEE 27TH INTERNATIONAL CONFERENCE ON EMERGING TECHNOLOGIES AND FACTORY AUTOMATION (ETFA)

Abstract
OpenMP can be used in real-time applications to enhance system performance. However, predictability of OpenMP applications is still a challenge. This paper investigates heuristics for the mapping of OpenMP task graphs to underlying threads, for the development of time-predictable OpenMP programs. These approaches are based on a global scheduling queue, as well as per-thread allocation queues. The proposed method is divided into scheduling and allocation phases. In the former phase, OpenMP task-parts are discovered from the OpenMP graph and placed in the scheduling queue. Afterwards, an appropriate allocation queue is selected for each task-part using four heuristic algorithms. In the latter phase, the best task-part is selected from the allocation queue to be allocated to and executed by an idle thread. Preliminary simulation results show that the new method outperforms BFS and WFS in terms of scheduling time and idle time.
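
The two-queue structure described in this abstract can be illustrated with a minimal sketch. The abstract does not name its four heuristics, so the "least-loaded queue" rule below is a stand-in assumption, and the function name, task-part names, and cost estimates are all hypothetical.

```python
from collections import deque

def map_task_parts(task_parts, num_threads):
    """Move task-parts from one global scheduling queue into
    per-thread allocation queues.

    task_parts: iterable of (name, estimated_cost) pairs.
    Heuristic (an assumption, not taken from the paper): send each
    task-part to the allocation queue with the least pending cost.
    """
    scheduling_queue = deque(task_parts)
    allocation_queues = [deque() for _ in range(num_threads)]
    load = [0] * num_threads  # pending estimated cost per thread

    while scheduling_queue:
        name, cost = scheduling_queue.popleft()
        # Pick the thread whose allocation queue is currently lightest.
        target = min(range(num_threads), key=lambda t: load[t])
        allocation_queues[target].append(name)
        load[target] += cost

    return allocation_queues, load

# Hypothetical task-parts with made-up cost estimates.
parts = [("t1", 4), ("t2", 2), ("t3", 3), ("t4", 1)]
queues, load = map_task_parts(parts, num_threads=2)
```

A real runtime would also need the second phase from the abstract, in which an idle thread picks the best task-part from its allocation queue; the sketch simply preserves insertion order.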

2023

A Scalable Clustered Architecture for Cyber-Physical Systems

Authors
Cabral, B; Costa, P; Fonseca, T; Ferreira, LL; Pinho, LM; Ribeiro, P;

Publication
2023 IEEE 21ST INTERNATIONAL CONFERENCE ON INDUSTRIAL INFORMATICS (INDIN)

Abstract
Developing distributed and scalable Cyber-Physical Systems (CPS) that can handle large amounts of data at high data rates at the edge remains a challenging task. Also, the limited availability of open-source solutions makes it difficult for developers and researchers to experiment with and deploy CPSs on a larger scale. This work introduces Edge4CPS, an open-source multi-architecture solution built over Kubernetes that aims to enable an easy-to-use, efficient and scalable solution for the deployment of applications on edge-like distributed computing clusters. To verify the successful real-world implementation of the introduced architecture, the system was tested in a railway scenario, derived from the Ferrovia 4.0 project, which highlights its functionalities.

2023

Framework for the Analysis and Configuration of Real-Time OpenMP Applications

Authors
Carvalho, T; Pinho, LM; Samadi, M; Royuela, S; Munera, A; Quiñones, E;

Publication
2023 IEEE 21ST INTERNATIONAL CONFERENCE ON INDUSTRIAL INFORMATICS (INDIN)

Abstract
High-performance cyber-physical applications impose several requirements with respect to performance, functional correctness and non-functional aspects. Nowadays, the design of these systems usually follows a model-driven approach, where models generate executable applications, usually with an automated approach. As these applications might execute in different parallel environments, their behavior becomes very hard to predict, making the verification of non-functional requirements complicated. In this regard, it is crucial to analyse and understand the impact that the mapping and scheduling of computation have on the real-time response of the applications. In fact, different strategies in these steps of the parallel orchestration may produce significantly different interference, leading to different timing behaviour. Tuning the application parameters and the system configuration proves to be one of the most fitting solutions. However, it can be very cumbersome for a developer to manually test all combinations of application and system configurations in the design space. This paper presents a methodology and a toolset to profile, analyse, and configure the timing behaviour of high-performance cyber-physical applications and the target platforms. The methodology leverages the possibility of generating a task dependency graph representing the parallel computation to evaluate, through measurements, different mapping configurations and select the one that minimizes response time.

2013

Critical-Path-First Based Allocation of Real-Time Streaming Applications on 2D Mesh-Type Multi-Cores

Authors
Abdel Aziz Ali, HIAA; Pinho, LM; Akesson, B;

Publication
2013 IEEE 19TH INTERNATIONAL CONFERENCE ON EMBEDDED AND REAL-TIME COMPUTING SYSTEMS AND APPLICATIONS (RTCSA)

Abstract
Designing cost-efficient multi-core real-time systems requires efficient techniques to allocate applications to cores while satisfying their timing constraints. However, existing approaches typically allocate using a First-Fit algorithm, which does not consider the execution time and potential parallelism of paths in the applications, resulting in over-dimensioned systems. This work addresses this problem by proposing a new heuristic algorithm, Critical-Path-First, for the allocation of real-time streaming applications modeled as dataflow graphs on 2D mesh multi-core processors. The main criterion of the algorithm is to allocate paths that have the highest impact on the execution time of the application first. It is also able to exploit parallelism in the application by allocating parallel paths on different cores. Experimental evaluation shows that the proposed heuristic improves the resource utilization by allocating up to 7% more applications and it minimizes the average end-to-end worst-case response time of the allocated applications by up to 31%.
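
As a rough illustration of the "paths with the highest impact on execution time" idea, the sketch below finds the critical (longest execution-time) path of a small acyclic graph. It is not the paper's Critical-Path-First algorithm: it ignores core allocation, the 2D mesh topology, and dataflow firing rates, and the graph, node names, and execution times are hypothetical.

```python
from functools import lru_cache

def critical_path(exec_time, successors):
    """Return (length, path) of the longest execution-time path in a DAG.

    exec_time:  dict mapping node -> execution time
    successors: dict mapping node -> list of successor nodes
    """
    @lru_cache(maxsize=None)
    def longest_from(node):
        # Longest path starting at `node`, found by recursing on successors.
        best_len, best_path = 0, ()
        for nxt in successors.get(node, ()):
            length, path = longest_from(nxt)
            if length > best_len:
                best_len, best_path = length, path
        return exec_time[node] + best_len, (node,) + best_path

    # The critical path is the longest path over all starting nodes.
    return max((longest_from(n) for n in exec_time), key=lambda r: r[0])

# Hypothetical four-actor graph with made-up execution times.
exec_time = {"src": 2, "a": 5, "b": 3, "sink": 1}
successors = {"src": ["a", "b"], "a": ["sink"], "b": ["sink"]}
length, path = critical_path(exec_time, successors)
```

In this example the path through "a" dominates the one through "b", so a critical-path-first strategy would consider it before the shorter parallel branch.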

2014

Response-time analysis of synchronous parallel tasks in multiprocessor systems

Authors
Maia, C; Bertogna, M; Nogueira, L; Pinho, LM;

Publication
ACM International Conference Proceeding Series

Abstract
Programmers resort to user-level parallel frameworks in order to exploit the parallelism provided by multiprocessor platforms. While such general frameworks do not support the stringent timing requirements of real-time systems, they offer a useful model of computation based on the standard fork/join, for which the analysis of timing properties makes sense. Very few works analyse the schedulability of synchronous parallel real-time tasks, which is a generalisation of the standard fork/join model. This paper proposes to narrow the gap by presenting a model that analyses the response-time of synchronous parallel real-time tasks. The model under consideration targets tasks with fixed priorities, composed of several segments with an arbitrary number of parallel and independent units of execution. We contribute to the state-of-the-art by analysing the response-time behaviour of synchronous parallel tasks. To accomplish this, we take into account concepts previously proposed in the literature and define new concepts such as carry-out decomposition and sliding window technique in order to compute the worst-case workload in a window of interest. Results show that the proposed approach is significantly better than current approaches, improving the state-of-the-art analysis of parallel real-time tasks. Copyright © 2014 ACM.
