2015
Authors
Azarian, A; Cardoso, JMP;
Publication
2015 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS)
Abstract
Recently, researchers have shown an increased interest in using task-level pipelining to accelerate the overall execution of applications mainly consisting of producer-consumer tasks. This paper proposes optimization techniques for enhancing our approach to pipelining the execution of producer-consumer tasks in FPGA-based multicore architectures with reductions in the number of accesses to external memory. Our approach is able to speed up the overall execution of successive, data-dependent tasks by using multiple cores and specific customization features provided by FPGAs. We evaluate the impact on the performance of task-level pipelining when using different hash functions and optimization schemes in the inter-stage buffer (ISB). The optimizations proposed in this paper were evaluated with FPGA implementations. The experimental results show the efficiency of a simple scheme to reduce external memory accesses and the suitability of the hash function being used. Furthermore, the results reveal noticeable performance improvements for the set of benchmarks being used.
2013
Authors
Paulino, N; Ferreira, JC; Cardoso, JMP;
Publication
RECONFIGURABLE COMPUTING: ARCHITECTURES, TOOLS AND APPLICATIONS
Abstract
This paper presents an extension to a hardware/software system architecture in which repetitive instruction traces, called Megablocks, are accelerated by a Reconfigurable Processing Unit (RPU). This scheme is supported by a custom toolchain able to automatically generate an RPU tailored for the execution of one or more Megablocks detected offline. Switching between hardware and software execution is done transparently, without modifications to source code or executable binaries. Our approach has been evaluated using an architecture with a MicroBlaze General Purpose Processor (GPP) softcore. By using a memory sharing mechanism, the RPU can access the GPP's data memory, allowing the acceleration of Megablocks with load/store operations. For a set of 21 embedded benchmarks, an average speedup of 1.43x is achieved, and a potential speedup of 2.09x is predicted for an implementation using a low-overhead interface for communication between GPP and RPU.
2013
Authors
Diniz, PC; Cardoso, JMP; de F. Coutinho, JG; Petrov, Z;
Publication
Compilation and Synthesis for Embedded Reconfigurable Systems
Abstract
2014
Authors
Santos, AC; Cardoso, JMP; Diniz, PC; Ferreira, DR; Petrov, Z;
Publication
JOURNAL OF SUPERCOMPUTING
Abstract
The traditional approach for specifying adaptive behavior in embedded applications requires developers to engage in error-prone programming tasks. This results in long design cycles and in the inherent inability to explore and evaluate a wide variety of alternative adaptation behaviors, critical for systems exposed to dynamic operational and situational environments. In this paper, we introduce a domain-specific language (DSL) for specifying and implementing run-time adaptable application behavior. We illustrate our approach using a real-life stereo navigation application as a case study, highlighting the impact and benefits of dynamically adapting algorithm parameters. The experiments show that our approach is effective: such run-time adaptations are easily specified at a higher level with the DSL, and thus with lower programming effort than when using a general-purpose language such as C.
2013
Authors
Coutinho, JGF; Cardoso, JMP; Carvalho, T; Nobre, R; Bhattacharya, S; Diniz, PC; Fitzpatrick, L; Nane, R;
Publication
RECONFIGURABLE COMPUTING: ARCHITECTURES, TOOLS AND APPLICATIONS
Abstract
In the context of the REFLECT project[1] we have developed an aspect-oriented compilation and synthesis toolchain that aims at facilitating the mapping of applications described in high-level imperative programming languages, such as C, to heterogeneous and configurable computing systems. More specifically, we have designed an aspect-oriented domain-specific language, called LARA[2], that allows programmers to convey application-specific and domain-specific knowledge as a way to capture non-functional concerns. The LARA specifications and the subsequent control of the tools via a code weaver allow a seamless exploration of alternative designs and run-time adaptive strategies, in effect enabling design-space exploration (DSE). © 2013 Springer-Verlag.
2013
Authors
Cardoso, JMP; Fernandes, JM; Monteiro, MP; Carvalho, T; Nobre, R;
Publication
JOURNAL OF SYSTEMS ARCHITECTURE
Abstract
This article presents an approach to enrich the MATLAB language with aspect-oriented modularity features, enabling developers to experiment with different implementation characteristics and to acquire runtime data and traces without polluting their base MATLAB code. We propose a language through which programmers configure the low-level data representation of variables and expressions. Examples include specifically tailored fixed-point data representations leading to more efficient support for the underlying hardware, e.g., digital signal processors and application-specific architectures without built-in floating-point units. This approach assists developers in adding handlers and monitoring features in a non-invasive way as well as configuring MATLAB functions with optimized implementations. Different aspect modules can be used to retarget common MATLAB code bases for different purposes and implementations. We validate the proposed approach with a set of representative examples where we attain a simple way to explore a number of properties. Experimental results and collected aspect-oriented software metrics lend support to the claims on its usefulness.