2006
Authors
Cardoso, JMP; Constantinides, GA;
Publication
INTERNATIONAL JOURNAL OF ELECTRONICS
Abstract
2012
Authors
Bispo, J; Cardoso, JMP; Monteiro, J;
Publication
25th Symposium on Integrated Circuits and Systems Design, SBCCI 2012, Brasilia, Brazil, August 30 - September 2, 2012
Abstract
Dynamic partitioning is a promising technique where computations are transparently moved from a General Purpose Processor (GPP) to a coprocessor during application execution. To be effective, the mapping of computations to the coprocessor needs to consider aggressive optimizations. One of the mapping optimizations is loop pipelining, a technique extensively studied and known to allow substantial performance improvements. This paper describes a technique for pipelining Megablocks, a type of runtime loop developed for dynamic partitioning. The technique transforms the body of Megablocks into an acyclic dataflow graph which can be fully pipelined and is based on the atomic execution of loop iterations. For a set of 9 benchmarks without memory operations, we generated pipelined hardware versions of the loops and estimate that the presented loop pipelining technique increases the average speedup of non-pipelined coprocessor accelerated designs from 1.6x to 2.2x. For a larger set of 61 benchmarks which include memory operations, the technique achieves a speedup increase from 2.5x to 5.6x. ©2012 IEEE.
2006
Authors
Bertels, K; Cardoso, JMP; Vassiliadis, S;
Publication
Lecture Notes in Computer Science
Abstract
2002
Authors
Cardoso, JMP; Weinhardt, M;
Publication
10TH ANNUAL IEEE SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES, PROCEEDINGS
Abstract
2012
Authors
Coutinho, JGF; Bhattacharya, S; Luk, W; Constantinides, GA; Cardoso, JMP; Carvalho, T; Diniz, PC; Petrov, Z;
Publication
15TH IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND ENGINEERING (CSE 2012) / 10TH IEEE/IFIP INTERNATIONAL CONFERENCE ON EMBEDDED AND UBIQUITOUS COMPUTING (EUC 2012)
Abstract
The increasing capability and flexibility of reconfigurable hardware, such as Field-Programmable Gate Arrays (FPGAs), give developers a wide range of architectural choices that can satisfy various non-functional requirements, such as those involving performance, resource and energy efficiency. This paper describes a novel approach, based on an aspect-oriented language called LARA, that enables systematic coding and reuse of optimisation strategies that address such non-functional requirements. Our approach will be presented in three steps. First, this approach is shown to support design space exploration (DSE) which makes use of various compilation and optimisation tools, through the deployment of a master weaver and multiple slave weavers. Second, we present three compilation and synthesis strategies for word-length optimisation based on this approach, which involve three tools: the WLOT word-length optimiser deploying a combination of analytical methods; the AutoESL tool compiling C-based descriptions into hardware; and the ISE tool targeting Xilinx devices. Third, the effectiveness of the approach is evaluated. In addition to promoting design re-use, our approach can be used to automatically produce a range of designs with different trade-offs in resource usage and numerical accuracy according to a given LARA-based strategy. For example, one implementation for a subband filter in an MPEG encoder provides 31% savings in area using non-uniform quantizers when compared to a floating-point description with a similar error specification at the output. Another fixed-point implementation for the gridIterate kernel used by a 3D path planning application consumed 25% less resources when the error specification is increased from 1e-6 to 1e-4.
2012
Authors
Cardoso, JMP; Carvalho, T; Coutinho, JGF; Diniz, PC; Petrov, Z; Luk, W;
Publication
15th Euromicro Conference on Digital System Design, DSD 2012, Cesme, Izmir, Turkey, September 5-8, 2012
Abstract
The synthesis and mapping of applications to configurable embedded systems is a notoriously hard process. Tools have a wide range of parameters, which interact in very unpredictable ways, thus creating a large and complex design space. When exploring this space, designers must understand the interfaces to the various tools and apply, often manually, a sequence of tool-specific transformations making this an extremely cumbersome and error-prone process. This paper describes the use of aspect-oriented techniques for capturing synthesis strategies for tuning the performance of applications' kernels. We illustrate the use of this approach when designing application-specific architectures generated by a high-level synthesis tool. The results highlight the impact of the various strategies when targeting custom hardware and expose the difficulties in devising these strategies. © 2012 IEEE.