Publications

Publications by João Paiva Cardoso

2005

New challenges in computer science education

Authors
Cardoso, JMP;

Publication
Proceedings of the 10th Annual SIGCSE Conference on Innovation and Technology in Computer Science Education

Abstract
It is predicted that by the year 2010, 90% of the overall program code developed will be for embedded computing systems. This fact requires urgent changes in the organization of current computer science curricula, as advocated by a number of academics. The changes will help students deal with the idiosyncrasies of embedded systems, which require knowledge about the computation engine, its energy consumption model, performance, interfaced artifacts, reconfigurable hardware programming, etc. This paper discusses some important issues to be included in modern computer science programs in order to prepare students to program future embedded computers. In particular, we present an approach we are attempting to implement at our institution. We also illustrate infrastructures that permit students to implement complex examples and gain deep knowledge about the topics being taught. Finally, with this paper we hope to foment a fruitful discussion on those issues. Copyright 2005 ACM.

2000

An enhanced static-list scheduling algorithm for temporal partitioning onto RPUs

Authors
Cardoso, JMP; Neto, HC;

Publication
VLSI: SYSTEMS ON A CHIP

Abstract
This paper presents a novel algorithm for temporal partitioning of graphs representing a behavioral description. The algorithm extends traditional static-list scheduling so that it resolves both scheduling and temporal partitioning. The nodes to be mapped into a partition are selected based on a statically computed cost model. The cost for each node integrates communication effects, the critical path length, and the ability of the critical path to hide the delay of parallel nodes. To keep the runtime low, the costs are not updated dynamically. The algorithm is compared with other schedulers and with close-to-optimum results obtained with a simulated annealing approach. The presented algorithm has been implemented, and the results show that it is robust, effective, and efficient, and that, compared to other methods, it finds very good results in small amounts of CPU time.
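
As a rough illustration of the idea in this abstract, the following Python sketch fills temporal partitions from a ready list ordered by a priority that is computed once and never updated. It is not the authors' algorithm: the priority here is only the longest delay path to a sink, whereas the paper's cost model also integrates communication effects and the hiding of parallel nodes. The function name `temporal_partition`, the `area`/`delay` dictionaries, and the single `capacity` limit are all assumptions made for the example.

```python
from functools import lru_cache


def temporal_partition(succ, area, delay, capacity):
    """Assign DAG nodes to temporal partitions using static priorities.

    succ:     dict node -> iterable of successor nodes (hypothetical input format)
    area:     dict node -> resource cost of the node
    delay:    dict node -> execution delay of the node
    capacity: resource budget available to each partition
    """
    nodes = list(area)
    pred_count = {n: 0 for n in nodes}
    for n in nodes:
        for s in succ.get(n, ()):
            pred_count[s] += 1

    @lru_cache(maxsize=None)
    def priority(n):
        # Longest delay path from n to any sink; computed once, never updated.
        return delay[n] + max((priority(s) for s in succ.get(n, ())), default=0)

    ready = [n for n in nodes if pred_count[n] == 0]
    partitions, current, used = [], [], 0
    while ready:
        ready.sort(key=priority, reverse=True)
        # Highest-priority ready node that still fits in the current partition.
        pick = next((n for n in ready if used + area[n] <= capacity), None)
        if pick is None:
            if not current:
                raise ValueError("a node is larger than the partition capacity")
            partitions.append(current)  # close this partition, open a new one
            current, used = [], 0
            continue
        ready.remove(pick)
        current.append(pick)
        used += area[pick]
        for s in succ.get(pick, ()):
            pred_count[s] -= 1
            if pred_count[s] == 0:
                ready.append(s)
    if current:
        partitions.append(current)
    return partitions


example = temporal_partition(
    succ={"a": ["c"], "b": ["c"], "c": ["d"]},
    area={"a": 2, "b": 2, "c": 3, "d": 1},
    delay={"a": 1, "b": 1, "c": 1, "d": 1},
    capacity=4,
)
print(example)
```

Because a node becomes ready only after all of its predecessors have been placed, every node ends up in the same partition as its predecessors or in a later one.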

2007

Reconfigurable Computing: Architectures, Tools and Applications, Third International Workshop, ARC 2007, Mangaratiba, Brazil, March 27-29, 2007

Authors
Diniz, PC; Marques, E; Bertels, K; Fernandes, MM; Cardoso, JMP;

Publication
ARC

Abstract

2005

Editorial message for the special track on embedded systems: Applications, solutions, and techniques

Authors
Bechini, A; Bodin, F; Prete, CA; Bartolini, S; Buttazzo, G; Cardoso, JMP; Dang, T; Engels, M; Foglia, P; Giorgi, R; Jha, NK; Knijnenburg, P; Krall, A; Kuo, TW; Ledeczi, A; Liu, J; Memik, G; O'Boyle, M; Schants, R; Sips, HJ; Talpin, JP; Vassiliadis, S; Yen, IL;

Publication
Proceedings of the ACM Symposium on Applied Computing

Abstract

2004

An environment for exploring data-driven architectures

Authors
Ferreira, R; Cardoso, JMP; Neto, HC;

Publication
FIELD-PROGRAMMABLE LOGIC AND APPLICATIONS, PROCEEDINGS

Abstract
A wide range of reconfigurable coarse-grain architectures has been proposed in recent years for an extensive set of applications. These architectures vary widely in the interconnectivity, number, granularity and complexity of the processing elements (PEs). The performance of a specific application usually depends heavily on the suitability of the PEs for the particular tasks involved, but tools to efficiently experiment with architectural features are lacking. This work proposes an environment for exploration and simulation of coarse-grain reconfigurable data-driven architectures. The proposed environment takes advantage of Java and XML technologies to enable a very efficient backend for experiments with different architectural trade-offs, from the array connectivity and topology to the granularity and complexity of each PE. As a proof of concept, we show results on implementing different versions of a FIR filter on a hexagonal data-driven array.

2008

Regular expression matching in reconfigurable hardware

Authors
Sourdis, I; Vassiliadis, S; Bispo, J; Cardoso, JMP;

Publication
JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY

Abstract
In this paper we describe a regular expression pattern matching approach for reconfigurable hardware. Following a Non-deterministic Finite Automata direction, we introduce three new basic building blocks to support constrained repetition syntax more efficiently than previous works. In addition, a number of optimization techniques are employed to reduce the area cost of the designs and maximize performance. Our design methodology is supported by a tool that automatically generates the circuitry for the given regular expressions and outputs Hardware Description Language representations ready for logic synthesis. The proposed approach is evaluated on network Intrusion Detection Systems (IDS). Recent IDS use regular expressions to represent hazardous packet payload contents. They require high-speed packet processing, providing a challenging case study for pattern matching using regular expressions. We use a number of IDS rulesets to show that our approach scales well as the number of regular expressions increases, and present a step-by-step optimization to survey the benefits of our techniques. The synthesis tool described in this study is used to generate hardware engines to match 300 to 1,500 IDS regular expressions using only 10-45 K logic cells and achieving throughput of 1.6-2.2 and 2.4-3.2 Gbps on Virtex2 and Virtex4 devices, respectively. Concerning the throughput per area required per matching non-meta character, our hardware engines are 10-20x more efficient than previous Field Programmable Gate Array approaches. Furthermore, the generated designs have comparable area requirements to current application-specific integrated circuit solutions.
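
To make the notion of NFA-based matching with constrained repetitions more concrete, here is a small software sketch in Python. It is only an analogy, not the hardware building blocks of the paper (which the abstract does not detail): each active state carries a repetition counter, so a pattern such as `ab{2,3}c` is tracked with counts instead of being unrolled. The function name `find_match` and the `(chars, lo, hi)` pattern encoding are assumptions made for the example.

```python
def find_match(items, text):
    """Return True if text contains a match of the pattern anywhere.

    items: list of (chars, lo, hi); each item matches between lo and hi
           characters drawn from the string chars (hypothetical encoding).
    """
    final = (len(items), 0)

    def closure(states):
        # Epsilon moves: an item that has reached its minimum count may be left.
        stack, seen = list(states), set(states)
        while stack:
            i, count = stack.pop()
            if i < len(items) and count >= items[i][1]:
                nxt = (i + 1, 0)
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    start = closure({(0, 0)})
    active = set(start)
    if final in active:
        return True
    for ch in text:
        stepped = set()
        for i, count in active:
            if i < len(items):
                chars, lo, hi = items[i]
                if ch in chars and count < hi:
                    stepped.add((i, count + 1))  # consume one more repetition
        # Re-arm the start states on every character: unanchored search,
        # as when scanning a packet payload for a pattern.
        active = closure(stepped) | start
        if final in active:
            return True
    return False


# Encodes the regular expression ab{2,3}c
pattern = [("a", 1, 1), ("b", 2, 3), ("c", 1, 1)]
print(find_match(pattern, "xxabbc"))
print(find_match(pattern, "abbbbc"))
```

All active states advance together on each input character, which loosely mirrors how a hardware NFA updates all of its state registers in a single clock cycle.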
