Publications

Publications by João Paiva Cardoso

2010

On identifying and optimizing instruction sequences for dynamic compilation

Authors
Bispo, J; Cardoso, JMP;

Publication
Proceedings of the International Conference on Field-Programmable Technology, FPT 2010, 8-10 December 2010, Tsinghua University, Beijing, China

Abstract
Typical computing systems based on general purpose processors (GPPs) can be extended with coarse-grained reconfigurable arrays (CGRAs) to provide higher performance and/or energy savings. In order for applications to take advantage of these computing systems, possibly including CGRAs varying in size, efficient dynamic compilation/mapping techniques are required. Dynamic mapping will be responsible for automatically moving computations originally running in the GPP to the CGRA. This paper presents our approach to dynamically map computations to CGRAs coupled to a GPP. Specifically, we evaluate the potential of the MegaBlock to accelerate the execution of a number of representative benchmarks when targeting an architecture based on a GPP and a CGRA. In addition, we show the impact on performance when using constant folding and propagation optimizations. © 2010 IEEE.

CloseRead Abstract

2010

A Query Processing Strategy for Conceptual Queries Based on Object-Role Modeling

Authors
Rosado, A; Cardoso, JMP;

Publication
Fourth International Conference on Network and System Security, NSS 2010, Melbourne, Victoria, Australia, September 1-3, 2010

Abstract
There have been several authors asserting that conceptual query languages (CQLs) perform better for querying purposes than logical query languages such as SQL. This paper proposes a query mapping algorithm for the FConQuer system. FConQuer is a framework based on object-role modeling (ORM) schemas, which allow the end-user to formulate conceptual queries through the FConQuer language. Our mapping algorithm allows the FConQuer system to process conceptual queries based on ORM schemas. More precisely, our algorithm maps FConQuer queries to OQL. © 2010 IEEE.

CloseRead Abstract

2010

On Using LALP to Map an Audio Encoder/Decoder on FPGAs

Authors
Menotti, R; Cardoso, JMP; Fernandes, MM; Marques, E;

Publication
IEEE INTERNATIONAL SYMPOSIUM ON INDUSTRIAL ELECTRONICS (ISIE 2010)

Abstract
This paper presents the use of LALP to implement typical industrial application kernels, ADPCM Encoder and Decoder, in FPGAs. LALP is a domain specific language and its compilation framework aims to the direct mapping of algorithms originally described in a high-level language onto FPGAs. In particular, LALP focuses on loop pipelining, a key technique for the design of hardware accelerators. While the language syntax resembles C, it contains certain constructs that allow programmer interventions to enforce or relax data dependences as needed, and so optimize the performance of the generated hardware. We present experimental results showing significant performance gains using this approach, while still keeping the language syntax and semantics close to popular high level software languages, a desirable feature when considering time to market constraints. We believe the performance gains observed for the ADPCM implementation can be extended to other industrial applications relying on algorithms spending most of their execution time on loop structures, such signal and image processing.

CloseRead Abstract

2009

Unbalanced FIFO sorting for FPGA-based systems

Authors
Marcelino, R; Neto, HC; Cardoso, JMP;

Publication
16th IEEE International Conference on Electronics, Circuits, and Systems, ICECS 2009, Yasmine Hammamet, Tunesia, 13-19 December, 2009

Abstract
Sorting is an important operation in a myriad of applications. It can contribute substantially to the overall execution time of an application. Dedicated sorting architectures can be used to accelerate applications and/or to reduce energy consumption. In this paper, we propose an efficient sorting unit aiming at acceleratin. The sort operation in FPGA-based embedded systems. The proposed sorting unit, named Unbalanced FIFO Merge Sorting Unit, is based on a FIFO merger implementation and is easily scalable to handle different data-set sizes. We show results oy the proposed sorting unit when isolated and when integrated in a software/hardware solution. When using a Xilinx Virtex-5 SX50T FPGA device. The logic resources for a 32 Kword machine is lower than 1%, an. The block RAM usage is about 22%. When compared to a quicksort pure software implementation, our Sorting Unit provides speed-ups from 1.2× to 50× and about 20× when isolated and when integrated in a software/hardware solution, respectively. © 2009 IEEE.

CloseRead Abstract

2009

LALP: A Novel Language to Program Custom FPGA-based Architectures

Authors
Menotti, R; Cardoso, JMP; Fernandes, MM; Marques, E;

Publication
PROCEEDINGS OF THE 21ST INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING

Abstract
Field-Programmable Gate Arrays (FPGAs) are becoming increasingly important in embedded and high-performance computing systems. They allow performance levels close to the ones obtained from Application-Specific Integrated Circuits (ASICs), while still keeping design and implementation flexibility. However to efficiently program FPGAs, one needs the expertise of hardware developers and to master hardware description languages (HDLs) such as VHDL or Verilog. The attempts to furnish a high-level compilation flow (e.g., front C programs) still have open issues before efficient and consistent results can be obtained. Bearing in mind the FPGA resources, we have developed LALP, a novel language to program FPGAs. A compilation framework including mapping capabilities supports the language. The main ideas behind LALP is to provide a higher abstraction level than HDLs, to exploit the intrinsic parallelism of hardware resources, and to permit the programmer to control execution stages whenever the compiler techniques are unable to generate efficient implementations. In this paper we describe LALP, and show how it can be used to achieve high-performance computing solutions.

CloseRead Abstract

2009

AUTOMATIC GENERATION OF FPGA HARDWARE ACCELERATORS USING A DOMAIN SPECIFIC LANGUAGE

Authors
Menotti, R; Cardoso, JMP; Fernandes, MM; Marques, E;

Publication
FPL: 2009 INTERNATIONAL CONFERENCE ON FIELD PROGRAMMABLE LOGIC AND APPLICATIONS

Abstract
This paper describes an alternative approach to direct mapping loops described in high-level languages onto FPGAs. Different from other approaches, this technique does not inherit from software pipelining techniques. The control is distributed over operations, thus a finite state machine is not necessary to control the order of operations, allowing efficient hardware implementations. The specification of a hardware block is done by means of LALP, a domain specific language specially designed to help the application of the techniques. While the language syntax resembles C, it contains certain constructs that allow programmer interventions to enforce or relax data dependences as needed, and so optimize the performance of the generated hardware blocks.

CloseRead Abstract