2011
Authors
Bispo, J; Cardanha Paulino, NM; Cardoso, JMP; Ferreira, JC;
Publication
2011 International Conference on Reconfigurable Computing and FPGAs, ReConFig 2011, Cancun, Mexico, November 30 - December 2, 2011
Abstract
This paper presents an offline tool-chain which automatically extracts loops (Megablocks) from MicroBlaze instruction traces and creates a tailored Reconfigurable Processing Unit (RPU) for those loops. The system moves loops from the CPU to the RPU transparently, at runtime, and without changing the executable binaries. The system was implemented in an FPGA and, for the tested kernels, measured speedups ranged between 3.9x and 18.2x for a MicroBlaze CPU without cache. We estimate speedups from 1.03x to 2.01x when compared to the best estimated performance achieved with a single MicroBlaze. © 2011 IEEE.
2021
Authors
Campos, HFS; Paulino, N;
Publication
CoRR
Abstract
2021
Authors
Campos, H; Paulino, N;
Publication
CoRR
Abstract
2021
Authors
Silva, PF; Bispo, J; Cardanha Paulino, NM;
Publication
CoRR
Abstract
2016
Authors
Cardanha Paulino, NM;
Publication
Abstract
2023
Authors
Sousa, LM; Bispo, J; Paulino, N;
Publication
2023 32nd International Conference on Parallel Architectures and Compilation Techniques, PACT
Abstract
Advancements in semiconductor technology no longer occur at the pace the industry had been accustomed to. We have entered what is considered by many to be the post-Moore era. In order to continue scaling performance, increasingly heterogeneous architectures are being developed and the use of special-purpose accelerators is on the rise. One notable example is Field-Programmable Gate Arrays (FPGAs), in both the data-center and embedded spaces. Advances in FPGA features and tools are allowing critical kernels to be accelerated on specialized hardware without fabrication costs. However, re-targeting code to such heterogeneous platforms still requires significant refactoring of the compute-intensive kernels, as well as knowledge of parallel computing and hardware design concepts to maximize performance. We present Tribble, a source-to-source framework under active development, capable of transforming regular C/C++ programs for execution on heterogeneous architectures. This includes transforming the target kernel source code so that it is amenable to circuit generation while keeping the original version for software execution, inserting code for task and memory management, and injecting a scheduler algorithm.