Publications - INESC TEC

Publications by João Bispo

2019

A framework for automatic and parameterizable memoization

Authors
Besnard, L; Pinto, P; Lasri, I; Bispo, J; Rohou, E; Cardoso, JMP

Publication
SOFTWAREX

Abstract
Improving execution time and energy efficiency is needed for many applications and usually requires sophisticated code transformations and compiler optimizations. One of the optimization techniques is memoization, which saves the results of computations so that future computations with the same inputs can be avoided. In this article we present a framework that automatically applies memoization techniques to C/C++ applications. The framework is based on automatic code transformations using a source-to-source compiler and on a memoization library. With the framework, users can select functions to memoize as long as they obey certain restrictions imposed by our current memoization library. We show the use of the framework and associated memoization technique and the impact on reducing the execution time and energy consumption of four representative benchmarks. © 2019 The Authors. Published by Elsevier B.V.
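
As a rough illustration of the memoization idea described in this abstract (and not the paper's actual C/C++ library or its API), the sketch below wraps a pure one-argument function with a fixed-size lookup table. The names memoize, table_size, and stats are hypothetical and chosen only for this example; the direct-mapped table that overwrites on collision is an assumption, not a detail taken from the paper.

```python
import math

_EMPTY = object()  # sentinel marking a table slot that was never written


def memoize(func, table_size=256):
    """Wrap a pure one-argument function with a fixed-size lookup table.

    Hypothetical sketch: a direct-mapped table where a colliding key
    simply overwrites the previous entry.
    """
    keys = [_EMPTY] * table_size
    values = [None] * table_size
    stats = {"hits": 0, "misses": 0}

    def wrapper(x):
        slot = hash(x) % table_size
        if keys[slot] is not _EMPTY and keys[slot] == x:
            # Same input seen before: reuse the stored result.
            stats["hits"] += 1
            return values[slot]
        # New input (or collision): compute and store the result.
        stats["misses"] += 1
        result = func(x)
        keys[slot] = x
        values[slot] = result
        return result

    wrapper.stats = stats
    return wrapper


memo_exp = memoize(math.exp)
first = memo_exp(1.5)   # computed by math.exp
second = memo_exp(1.5)  # served from the table
```

This only pays off when the wrapped function is pure and is called repeatedly with the same inputs, which is in line with the restrictions on memoizable functions that the abstract mentions.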


2020

Source-to-source compilation targeting OpenMP-based automatic parallelization of C applications

Authors
Arabnejad, H; Bispo, J; Cardoso, JMP; Barbosa, JG

Publication
JOURNAL OF SUPERCOMPUTING

Abstract
Directive-driven programming models, such as OpenMP, are one solution for exploring the potential parallelism when targeting multicore architectures. Although these approaches significantly help developers, code parallelization is still a non-trivial and time-consuming process, requiring parallel programming skills. Thus, many efforts have been made toward automatic parallelization of existing sequential code. This article presents AutoPar-Clava, an OpenMP-based automatic parallelization compiler which: (1) statically detects parallelizable loops in C applications; (2) classifies variables used inside the target loop based on their access pattern; (3) supports reduction clauses on scalar and array variables whenever applicable; and (4) generates C OpenMP parallel code from the input sequential version. The effectiveness of AutoPar-Clava is evaluated by using the NAS and Polyhedral Benchmark suites and targeting an x86-based computing platform. The achieved results are very promising and compare favorably with closely related auto-parallelization compilers, such as Intel C/C++ Compiler (icc), ROSE, TRACO and CETUS.


2020

Exploration of FPGA-Based Hardware Designs for QR Decomposition for Solving Stiff ODE Numerical Methods Using the HARP Hybrid Architecture

Authors
de Souza, CAO; Bispo, J; Cardoso, JMP; Diniz, PC; Marques, E

Publication
ELECTRONICS

Abstract
In this article, we focus on the acceleration of a chemical reaction simulation that relies on a system of stiff ordinary differential equations (ODEs) targeting heterogeneous computing systems with CPUs and field-programmable gate arrays (FPGAs). Specifically, we target an essential kernel of the coupled chemistry aerosol-tracer transport model to the Brazilian developments on the regional atmospheric modeling system (CCATT-BRAMS). We focus on a linear solve step using the QR factorization based on the modified Gram-Schmidt method as the basis of the ODE solver in this application. We target the Intel hardware accelerator research program (HARP) architecture with the OpenCL programming environment for these early experiments. Our design exploration reveals a hardware design that is up to 4 times faster than the original iterative Jacobi method used in this solver. Still, even with hardware support, the overall performance of our QR-based hardware is lower than its original software version.
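
For readers unfamiliar with the linear solve step named in this abstract, the following is a minimal software sketch of QR factorization by modified Gram-Schmidt followed by back substitution. It is a textbook formulation in plain Python, not the paper's OpenCL/FPGA design; the function names mgs_qr and solve_qr are hypothetical, and the sketch assumes a square matrix with linearly independent columns.

```python
import math


def mgs_qr(a):
    """Factor a (list of rows, full column rank) as Q R.

    Returns (q_cols, r): q_cols holds the orthonormal columns of Q,
    r is the upper-triangular factor as a list of rows.
    """
    m, n = len(a), len(a[0])
    v = [[float(a[i][j]) for i in range(m)] for j in range(n)]  # columns of a
    q_cols = [None] * n
    r = [[0.0] * n for _ in range(n)]
    for j in range(n):
        r[j][j] = math.sqrt(sum(x * x for x in v[j]))
        q_cols[j] = [x / r[j][j] for x in v[j]]
        # Modified Gram-Schmidt: immediately remove the q_j component
        # from every remaining column.
        for k in range(j + 1, n):
            r[j][k] = sum(qi * vi for qi, vi in zip(q_cols[j], v[k]))
            v[k] = [vi - r[j][k] * qi for vi, qi in zip(v[k], q_cols[j])]
    return q_cols, r


def solve_qr(a, b):
    """Solve a x = b for square a via R x = Q^T b and back substitution."""
    q_cols, r = mgs_qr(a)
    n = len(r)
    y = [sum(qi * bi for qi, bi in zip(col, b)) for col in q_cols]  # Q^T b
    x = [0.0] * n
    for i in reversed(range(n)):
        s = y[i] - sum(r[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / r[i][i]
    return x


# Example system: 4x + y = 1 and 2x + 3y = 2.
x = solve_qr([[4.0, 1.0], [2.0, 3.0]], [1.0, 2.0])
```

Unlike the iterative Jacobi method the abstract compares against, this is a direct method: it finishes in a fixed number of steps for a given matrix size.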


2020

Executing ARMv8 Loop Traces on Reconfigurable Accelerator via Binary Translation Framework

Authors
Paulino, N; Ferreira, JC; Bispo, J; Cardoso, JMP

Publication
2020 30TH INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE LOGIC AND APPLICATIONS (FPL)

Abstract
Performance and power efficiency in edge and embedded systems can benefit from specialized hardware. To avoid the effort of manual hardware design, we explore the generation of accelerator circuits from binary instruction traces for several Instruction Set Architectures.


2020

Overviewing the liveness of refactoring for energy efficiency

Authors
Moreira, E; Correia, FF; Bispo, J

Publication
Programming'20: 4th International Conference on the Art, Science, and Engineering of Programming, Porto, Portugal, March 23-26, 2020

Abstract
The number of mobile device users has been growing in recent years, but the limited battery life of these devices is considered one of the major issues among users and programmers. Therefore, there is a need to guide developers in developing mobile applications in the most energy-efficient way. One way to do this is to provide live feedback about the energy efficiency of a program while it is being written. We have analyzed and compared a total of 16 different tools and presented a list of 15 code smells and respective refactorings. Of the analyzed tools, Leafactor is the closest to a valid solution to our problem because it is the only energy-aware tool with the highest liveness level. However, to be executed, it must be triggered by the programmer in the IDE by selecting the file, instead of running automatically, without the programmer noticing, and refactoring the inefficient code. © 2020 Owner/Author.


2022

Pegasus: Performance Engineering for Software Applications Targeting HPC Systems

Authors
Pinto, P; Bispo, J; Cardoso, J; Barbosa, JG; Gadioli, D; Palermo, G; Martinovic, J; Golasowski, M; Slaninova, K; Cmar, R; Silvano, C

Publication
IEEE TRANSACTIONS ON SOFTWARE ENGINEERING

Abstract
Developing and optimizing software applications for high performance and energy efficiency is a very challenging task, even when considering a single target machine. For instance, optimizing for multicore-based computing systems requires in-depth knowledge about programming languages, application programming interfaces (APIs), compilers, performance tuning tools, and computer architecture and organization. Many of the tasks of performance engineering methodologies require manual effort and the use of different tools that are not always part of an integrated toolchain. This paper presents Pegasus, a performance engineering approach supported by a framework that consists of a source-to-source compiler, controlled and guided by strategies programmed in a Domain-Specific Language, and an autotuner. Pegasus is a holistic and versatile approach spanning various decision layers composing the software stack, and exploiting the system capabilities and workloads effectively through the use of runtime autotuning. The Pegasus approach helps developers by automating tasks regarding the efficient implementation of software applications in multicore computing systems. These tasks focus on application analysis, profiling, code transformations, and the integration of runtime autotuning. Pegasus allows developers to program their strategies or to automatically apply existing strategies to software applications in order to ensure compliance with non-functional requirements, such as performance and energy efficiency. We show how to apply Pegasus and demonstrate its applicability and effectiveness in a complex case study, which includes tasks from a smart navigation system.
