2004
Authors
Baradaran, N; Park, J; Diniz, PC;
Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Abstract
2004
Authors
Diniz, PC;
Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Abstract
Analyses and transformations of programs that manipulate pointer-based data structures rely on understanding the topological relationships between the nodes, i.e., the overall shape of the data structures. Current static shape analyses either assume correctness of the code or trade off accuracy for analysis performance, leading in most cases to shape information that is of little use for practical purposes. This paper introduces four novel analysis techniques, namely structural fields, scan loops, assumed/verified shape properties and context tracing. Analysis of structural fields allows compilers to uncover node configurations that play key roles in the data structure. Analysis of scan loops allows compilers to establish accurate relationships between pointer variables while traversing the data structures. Assumed/verified property analysis derives sufficient shape properties that guarantee termination of scan loops. These properties must then be verified during shape analysis for consistency. Context tracing allows the analyses to isolate data structure nodes by tracing relationships between pointer variables along control-flow paths in the program. We believe that future static shape and safety analysis algorithms will have to include some if not all of these techniques to attain a high level of accuracy. In this paper we illustrate the application of the proposed techniques to codes that build (correctly as well as incorrectly) data structures that are beyond the reach of current approaches. © Springer-Verlag 2004.
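As an illustration of the kind of code the abstract refers to, the sketch below shows a traversal loop over a singly linked list. It is not taken from the paper: the names `Node`, `insert_sorted` and `to_list` are hypothetical, and reading the `while` loop as a "scan loop" is an assumption based only on the abstract's wording. The loop walks the list without modifying it, and on exit the two pointer variables stand in a fixed relationship (`prev.next is cur`, or `prev is None` and `cur is head`). It terminates only if the list is acyclic, which is the sort of shape property an analysis would have to assume and later verify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    next: Optional["Node"] = None  # the single link field followed by the loop


def insert_sorted(head: Optional[Node], key: int) -> Node:
    """Insert key into an ascending list and return the (possibly new) head."""
    prev = None
    cur = head
    # Traversal loop: only reads the structure, advancing along `next`.
    # It terminates only if the list reachable from head is acyclic.
    while cur is not None and cur.key < key:
        prev = cur
        cur = cur.next
    # Relationship on exit: prev is None and cur is head, or prev.next is cur.
    node = Node(key, cur)
    if prev is None:
        return node
    prev.next = node
    return head


def to_list(head: Optional[Node]) -> list:
    """Collect the keys in traversal order (hypothetical helper)."""
    out = []
    while head is not None:
        out.append(head.key)
        head = head.next
    return out
```

Inserting 3, 1 and 2 in that order into an empty list should leave the keys in ascending order, since each insertion relies on the pointer relationship established by the loop.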
2004
Authors
Diniz, PC;
Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Abstract
Reconfigurable computing architectures promise to substantially increase the performance of computations through the customization of data-path and storage structures best suited to the specific needs of each computation. The need to synthesize, either fully or partially, the structure of the target architecture while simultaneously attempting to optimize the mapping of the computation to that architecture creates a vast design space exploration (DSE) challenge. In this paper we describe current approaches to this DSE problem using program analysis, estimation, modeling and empirical optimization techniques. We also describe a unified approach for this DSE challenge in which these techniques can be complemented with history- and learning-based approaches. © Springer-Verlag Berlin Heidelberg 2004.
2004
Authors
Baradaran, N; Park, J; Diniz, PC;
Publication
Proceedings - 2004 IEEE International Conference on Field-Programmable Technology, FPT '04
Abstract
Contemporary configurable architectures have dedicated internal functional units such as multipliers, high-capacity storage RAM, and even CAM blocks. These RAM blocks allow the implementations to cache data to be reused in the near future, thereby avoiding the latency of external memory accesses. In this paper we present a data allocation algorithm that utilizes the RAM blocks in the presence of a limited number of hardware registers. This algorithm, based on a compiler data reuse analysis, determines which data should be cached in the internal RAM blocks and when. The preliminary results, for a set of image/signal processing kernels targeting a Xilinx Virtex™ FPGA device, reveal that despite the increased latency of accessing data in RAM blocks, designs that use them require fewer configurable resources than designs that exclusively use registers, while attaining comparable and in some cases even better performance. © 2004 IEEE.
2004
Authors
Diniz, P; Lee, YJ; Hall, M; Lucas, R;
Publication
Proceedings - International Parallel and Distributed Processing Symposium, IPDPS 2004 (Abstracts and CD-ROM)
Abstract
This paper describes initial experiences with semi-automated performance tuning of a linear solver in LS-DYNA, a large, widely-used engineering application. Through a collection of tools supporting empirical optimization, we alleviate the burden of performance tuning for mapping today's complex software to increasingly complex hardware platforms. We describe a tool that automatically isolates code segments for the purposes of performance tuning, discuss code generation issues, and present a collection of automatically-generated performance results for specific performance-oriented parameters.
2005
Authors
Baradaran, N; Diniz, P;
Publication
ARC 2005 - International Workshop on Applied Reconfigurable Computing 2005
Abstract
Current high-end Field-Programmable-Gate-Array (FPGA) parts offer a large number of configurable resources. These can be organized in custom storage structures such as tapped-delay lines, in addition to a number of very dense high-capacity Random-Access-Memory (RAM) and Content-Addressable-Memory (CAM) blocks. The extreme flexibility of the size, organization and interconnection between these storage resources enables compilers to generate custom hardware designs tailored to capture the application-specific data reuse opportunities. In this paper we outline the basic compiler data dependence analysis approaches that can be used to uncover reuse opportunities within a loop nest. We then describe the challenges of exploiting these opportunities in modern FPGAs.
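To make the notion of data reuse within a loop nest concrete, the following software sketch compares two ways of computing a small FIR-style kernel. It is an illustration under stated assumptions, not the authors' method: the function names `fir_naive` and `fir_delay_line` are hypothetical, the `loads` counter is a stand-in for external memory accesses, and a `collections.deque` with a fixed length plays the role of a tapped-delay line.

```python
from collections import deque


def fir_naive(x, c):
    """Recompute every window from scratch: len(c) loads per output."""
    loads = 0
    y = []
    for i in range(len(x) - len(c) + 1):
        acc = 0
        for k in range(len(c)):
            acc += c[k] * x[i + k]  # x[i + k] is re-read by neighbouring iterations
            loads += 1
        y.append(acc)
    return y, loads


def fir_delay_line(x, c):
    """Keep the last len(c) samples in a shift structure: one load per sample."""
    taps = deque(maxlen=len(c))  # models a tapped-delay line (assumption)
    loads = 0
    y = []
    for sample in x:
        taps.append(sample)  # the only access counted as an external load
        loads += 1
        if len(taps) == len(c):
            y.append(sum(ck * t for ck, t in zip(c, taps)))
    return y, loads
```

Both functions produce the same outputs. For an input of five samples and three coefficients, the naive version counts nine loads and the delay-line version counts five, which reflects the reuse of `x[i + k]` across consecutive iterations of the outer loop.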