Publications

Publications by João Paiva Cardoso

2010

Compiling for Reconfigurable Computing: A Survey

Authors
Cardoso, JMP; Diniz, PC; Weinhardt, M

Publication
ACM COMPUTING SURVEYS

Abstract
Reconfigurable computing platforms offer the promise of substantially accelerating computations through the concurrent nature of hardware structures and the ability of these architectures to support hardware customization. Effectively programming such reconfigurable architectures, however, is an extremely cumbersome and error-prone process, as it requires programmers to assume the role of hardware designers while mastering hardware description languages, thus limiting the acceptance and dissemination of this promising technology. To address this problem, researchers have developed numerous approaches at both the programming-language and compilation levels to offer high-level programming abstractions that would allow programmers to easily map applications to reconfigurable architectures. This survey describes the major research efforts on compilation techniques for reconfigurable computing architectures. The survey focuses on efforts that map computations written in imperative programming languages to reconfigurable architectures and identifies the main compilation and synthesis techniques used in this mapping.

2009

The current feasibility of gesture recognition for a smartphone using J2ME

Authors
Tarrataca, L; Santos, AC; Cardoso, JMP

Publication
Proceedings of the ACM Symposium on Applied Computing

Abstract
The need to improve communication between humans and computers has been instrumental in defining new communication models and, accordingly, new ways of interacting with machines. Using gestures as a means of communication has been a challenging task. The latest generation of smartphones boasts powerful processors and built-in video cameras, making them capable of executing complex and computationally demanding applications. Thus, the integration of gesture recognition systems in smartphone applications might soon be a reality. In this paper, we present studies of a gesture recognition prototype system for smartphones. We use a number of tasks typically employed in gesture recognition systems, which allow us to assess the current feasibility of implementing this kind of system on smartphones. Based on both the execution time and classification performance, we conclude that the latest smartphone generation is capable of executing complex image processing applications, with the most penalizing factor being camera performance regarding capture rates with the current J2ME support. Copyright 2009 ACM.

2005

Dynamic loop pipelining in data-driven architectures

Authors
Cardoso, JMP

Publication
2005 Computing Frontiers Conference

Abstract
Data-driven array architectures seem to be important alternatives for coarse-grained reconfigurable computing platforms. Their use has provided performance improvements over microprocessors and shorter programming cycles than FPGA-based platforms. As with other architectures, in data-driven architectures loop pipelining plays an important role in improving performance. Usually this kind of pipelining can be achieved using the dataflow software pipelining technique or other software pipelining approaches. Although performance improvements are achieved, those techniques depend heavily on the insertion of pipelining stages and thus require complex balancing efforts. Furthermore, those techniques statically define the pipelining and do not take full advantage of the dynamic scheduling attainable by the data-driven concept. This paper presents a novel scheme to pipeline loops in data-driven architectures, orchestrated by a handshaking protocol. Using the new approach, self loop pipelining is naturally achieved. The scheme is based on duplicating cyclic hardware structures so that they execute autonomously, with synchronization being achieved by the data flow. It can be applied to nested loops, requires less aggressive pipeline balancing efforts than usual software pipelining techniques, and innermost loops with conditional structures can be pipelined without conservative pipelining implementations. We show results of using the proposed technique when mapping algorithms in imperative programming languages to the PACT eXtreme Processing Platform (XPP). The results confirm improvements over the use of conventional loop pipelining techniques. Better performance and fewer resources are achieved in a number of cases. Copyright 2005 ACM.

2003

Loop dissevering: A technique for temporally partitioning loops in dynamically reconfigurable computing platforms

Authors
Cardoso, JMP

Publication
Proceedings - International Parallel and Distributed Processing Symposium, IPDPS 2003

Abstract
This paper presents a technique, called loop dissevering, for temporally partitioning any type of loop present in programming languages. The technique can be used in the presence of complex loops that exceed the physically available hardware resources. Unlike loop fission or distribution, the technique can be applied to all types of loops and it is not constrained by loop dependences. Thus, the technique guarantees the compilation of complex loops that otherwise cannot be mapped to the target reconfigurable computing architecture. Moreover, the technique only needs to communicate scalar variables between temporal partitions (configurations) and does not need the auxiliary array variables used for scalar expansion when applying loop distribution. We show the results of applying the technique when compiling C programs to the PACT eXtreme Processing Platform (XPP) and to a hypothetical version with faster switching between contexts. We show that the technique leads to implementations using fewer resources and might lead to performance improvements when it is possible to overlap some of the execution stages (e.g., fetch, configure, and compute). As far as performance is concerned, the technique is most efficient and the reconfiguration time is fast. © 2003 IEEE.

2008

IJE special issue on reconfigurable hardware systems

Authors
Cardoso, JMP; Diniz, PC

Publication
INTERNATIONAL JOURNAL OF ELECTRONICS

Abstract
Three articles focusing on interesting architectural features and/or execution techniques which configurable architectures make more accessible are discussed. The Applied Reconfigurable Computing (ARC) workshop series has been devoted to addressing the role of software programmers and hardware designers in implementing configurable and reconfigurable architectures while still recognizing the value of basic configurable computing techniques and application areas. The first article, by Wu, Kanstein, Madsen and Berekovic, describes the application of multithreading to a coarse-grain reconfigurable architecture. The second article, by Chikhi, Derrien, Noumsi and Quinton, is devoted to a specific architectural feature, the inclusion of FLASH memory to facilitate the implementation of image-based algorithms, an application that matches very well with FPGA configurable technology. Finally, a third article in this track, by Hur, Wong and Vassiliadis, explores the use of point-to-point interconnects in a contemporary FPGA.

2008

Synthesis of regular expressions for FPGAs

Authors
Bispo, J; Cardoso, JMP

Publication
INTERNATIONAL JOURNAL OF ELECTRONICS

Abstract
Regular expressions are used in many applications to specify multiple and complex text patterns in a compact way. In some of these applications, large sets of regular expressions need to be evaluated to detect matched content. Specialised hardware engines are employed when software-based regular expression engines are not able to meet the performance requirements imposed by such applications. Since the sets of regular expressions are periodically modified and/or extended, FPGAs are an attractive hardware solution to meet both programmability and high-performance demands. However, efficient automatic synthesis tools are of paramount importance to achieve fast prototyping of regular expression engines on these devices. This paper presents an overview of the synthesis of regular expressions with the aim of achieving high-performance engines for FPGAs. We focus on describing current solutions, proposing new solutions for constraint repetitions and overlapped matching, and discussing a number of challenges and open issues. As a case study, we present FPGA implementations of the regular expressions included in two rule sets for network intrusion detection systems (NIDS), Bleeding Edge and Snort, obtained using a state-of-the-art synthesis approach.
