Publicacoes - INESC TEC

Publicações

Publicações por HumanISE

2014

Trace-Based Reconfigurable Acceleration with Data Cache and External Memory Support

Autores
Paulino, N; Ferreira, JC; Cardoso, JMP;

Publicação
2014 IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED PROCESSING WITH APPLICATIONS (ISPA)

Abstract
This paper presents a binary acceleration approach based on extending a General Purpose Processor (GPP) with a Reconfigurable Processing Unit (RPU), both sharing an external data memory. In this approach repeating sequences of GPP instructions are migrated to the RPU. The RPU resources are selected and organized off-line using execution trace information. The RPU core is composed of Functional Units (FUs) that correspond to single CPU instructions. The FUs are arranged in stages of mutually independent operations. The RPU can enable several stages in tandem, depending on the data dependencies. External data memory accesses are handled by a configurable dual-port cache. A prototype implementation of the architecture on a Spartan-6 FPGA was validated with 12 benchmarks and achieved an overall geometric mean speedup of 1.91x.

FecharLer Abstract

2014

Coarse/Fine-grained Approaches for Pipelining Computing Stages in FPGA-Based Multicore Architectures

Autores
Azarian, A; Cardoso, JMP;

Publicação
EURO-PAR 2014: PARALLEL PROCESSING WORKSHOPS, PT II

Abstract
In recent years, there has been increasing interest on using task-level pipelining to accelerate the overall execution of applications mainly consisting of producer/consumer tasks. This paper presents coarse/fine-grained data flow synchronization approaches to achieve pipelining execution of the producer/consumer tasks in FPGA-based multicore architectures. Our approaches are able to speedup the overall execution of successive, data-dependent tasks, by using multiple cores and specific customization features provided by FPGAs. An important component of our approach is the use of customized inter-stage buffer schemes to communicate data and to synchronize the cores associated to the producer/consumer tasks. The experimental results show the feasibility of the approach when dealing with producer/consumer tasks with out-of-order communication and reveal noticeable performance improvements for a number of benchmarks over a single core implementation and not using task-level pipelining.

FecharLer Abstract

2014

Multi-target c code generation from MATLAB

Autores
Bispo, J; Reis, L; Cardoso, JMP;

Publicação
Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI)

Abstract
This paper describes our recent work on MATISSE, a framework for MATLAB to C compilation. We focus on the new optimizations and transformations, as well as on OpenCL generation. MATISSE is controlled with LARA, an aspect-oriented language, able to specify transformations to the input MATLAB code (e.g., insertion of code for variable initialization and for monitoring) and to express information concerning types and shapes of variables. We evaluate the compiler with a set of benchmarks when targeting both an embedded system and a desktop system. The results show that we were able to achieve a speedup up to 1.8× by employing information provided by LARA aspects. We also compare the execution time of the generated C code with the original code running on MATLAB, and we achieve a geometric mean speedup of 19×. The geometric mean speedup reduces to 12× when optimizing the MATLAB code with LARA aspects. Finally, we present a preliminary version of a fully-functioning pragma-based OpenCL generator, built over the MATISSE framework..

FecharLer Abstract

2014

Practical education fostered by research projects in an embedded systems course

Autores
Bonato, V; Fernandes, MM; Cardoso, JMP; Marques, E;

Publicação
International Journal of Reconfigurable Computing

Abstract
The very nature of universities makes them unique environments for research and teaching. Although both activities constantly borrow from each other, a deeper level of interaction is not always achieved for several reasons. This paper presents a successful experience on conducting an undergraduate course on embedded systems, based on strong interaction with related research activities previously conducted by the authors. Known for being everywhere, embedded systems are constantly expanding in both complexity and volume production. In addition, heterogeneous systems are becoming prevalent in modern applications, standing as an additional difficulty to students in this area. In this context, this paper presents experiences in teaching embedded systems using a project-based learning pedagogical approach, with strong emphasis on mobile robotic applications previously developed by MSc and PhD students. As a result, it has been observed that undergraduate students have the opportunity to build a strong background and feel better prepared to face the challenges to be found in their future professional activities. © 2014 Vanderlei Bonato et al.

FecharLer Abstract

2014

Representation of Evolutionary Algorithms in FPGA Cluster for Project of Large-Scale Networks

Autores
Perina, AndreB.; Gois, MarcilyanneMoreira; Matias, Paulo; Cardoso, JoaoM.P.; Delbem, AlexandreC.B.; Bonato, Vanderlei;

Publicação
CoRR

Abstract

2014

Reconfigurable Computing: Architectures, Tools, and Applications - 10th International Symposium, ARC 2014, Vilamoura, Portugal, April 14-16, 2014. Proceedings

Autores
Goehringer, D; Santambrogio, MD; Cardoso, JMP; Bertels, K;

Publicação
ARC

Abstract