Publications

Publications by CRACS

1998

Distance: A New Metric for Controlling Granularity for Parallel Execution

Authors
Shen, K; Costa, VS; King, A;

Publication
Proceedings of the 1998 Joint International Conference and Symposium on Logic Programming, Manchester, UK, June 15-19, 1998

Abstract

1997

Thread- and process-based implementations of the pSystem parallel programming environment

Authors
Lopes, LMB; Silva, FMA;

Publication
SOFTWARE-PRACTICE & EXPERIENCE

Abstract
Run-time work distribution in parallel programming systems is usually accomplished through the use of dynamic scheduling heuristics. Their sensitivity to run-time information such as global work-load, task granularity, data dependencies, locality of information, among others, is essential when trying to optimize performance. Adaptive schedulers that base their decisions on feedback from the system are therefore of special importance. We have developed and used a general purpose parallel programming system, the pSystem, that also served as a test-bed environment on which we have experimented with and studied the performance of distinct scheduling heuristics. Currently, we have two versions of the system: one based on Unix processes and the other on Solaris threads. Threads (particularly user-level threads) are usually associated with low execution overheads, since they require minimal interaction with the operating system kernel. This suggests that lower grain parallelism may be more effectively exploited with a thread-based parallel programming system. Performance analysis of both implementations over a set of well-known benchmarks, with various schedulers, shows that threads scale better under higher system loads and/or when the granularity of the tasks being executed is below a given threshold value. This paper starts with a description of the design and implementation of the pSystem computational model, followed by a detailed description of several experiments and the analysis of their results. (C) 1997 John Wiley & Sons, Ltd.
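
As a purely illustrative aside (this is not pSystem code, and names such as small_task and run are invented here), the thread-versus-process trade-off the abstract describes can be sketched in Python: both pools run the same set of fine-grained tasks, and the extra per-task dispatch cost of the process pool (pickling arguments and crossing process boundaries) is the kind of overhead that makes granularity matter. In CPython the analogy is loose, since the GIL serialises CPU-bound threads, but the dispatch-cost contrast is the point.

import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def small_task(n):
    # A deliberately fine-grained unit of work.
    return sum(i * i for i in range(n))

def run(executor_cls, tasks, workers=4):
    # Time how long the given pool takes to run all tasks.
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pool:
        list(pool.map(small_task, tasks))
    return time.perf_counter() - start

if __name__ == "__main__":
    tasks = [200] * 10_000   # many tiny tasks: granularity below the "threshold"
    print(f"threads:   {run(ThreadPoolExecutor, tasks):.3f}s")
    print(f"processes: {run(ProcessPoolExecutor, tasks):.3f}s")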

1997

The SBA: Exploiting orthogonality in AND-OR parallel systems

Authors
Correia, ME; Silva, F; Costa, VS;

Publication
LOGIC PROGRAMMING - PROCEEDINGS OF THE 1997 INTERNATIONAL SYMPOSIUM

Abstract
One of the advantages of logic programming is the fact that it offers many sources of implicit parallelism, such as and-parallelism and or-parallelism. Recently, research has concentrated on integrating the different forms of parallelism into a single combined system. In this work we concentrate on the problem of integrating or-parallelism and independent and-parallelism for parallel Prolog systems. We contend that previous data structures require pure recomputation and therefore do not allow for orthogonality between and-parallelism and or-parallelism. In contrast, we submit that a simpler solution, the sparse binding array, does guarantee this goal, and explain in detail how independent and-parallelism and or-parallelism can thus be efficiently combined.

1997

Evaluating parallel logic programming systems on scalable multiprocessors

Authors
Costa, VS; Bianchini, R; de, CDI;

Publication
International Symposium on Parallel Symbolic Computation, Proceedings, PASCO

Abstract
Parallel logic programming systems are sophisticated examples of symbolic computing systems. They address problems such as dynamic memory allocation, scheduling irregular execution patterns, and managing different types of implicit parallelism. Most parallel logic programming systems have been developed for bus-based shared-memory architectures. The complexity of parallel logic programming systems and the large amount of data they process raise the question of whether logic programming systems can still obtain good performance on scalable architectures, such as distributed shared-memory systems. In this work we use execution-driven simulation to investigate the access patterns and caching behaviour exhibited by a parallel logic programming system, Andorra-I. We show that the system obtains reasonable performance, but that it does not scale well. By studying the behaviour of the major data structures in Andorra-I in detail, we conclude that this result is largely a consequence of the scheduling and work manipulation implementation used in the system. We also show that Andorra-I's data structures exhibit widely-varying memory access patterns and caching behaviour, which depend not only on the number of processors, but also on the amount and type of parallelism available in the application program. Some of these data structures clearly favour invalidate-based cache coherence protocols, while others favour update-based protocols. Since most of Andorra-I's data structures are common to other parallel logic programming systems, we believe that these systems can greatly benefit from flexible coherence schemes where either the compiler can specify the protocol to be used for each data structure or the protocol can adapt to varying memory access patterns.
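
The invalidate-versus-update contrast mentioned above can be made concrete with a hedged toy model (in no way Andorra-I code; the class ToyCoherentMemory and the producer-consumer access pattern are invented for illustration): on a write, an invalidate-based protocol drops other caches' copies, forcing a miss on their next read, while an update-based protocol pushes the new value to them.

class ToyCoherentMemory:
    """Toy shared memory with one private cache per worker."""
    def __init__(self, n_workers, policy="invalidate"):
        self.policy = policy                     # "invalidate" or "update"
        self.memory = {}                         # shared backing store
        self.caches = [dict() for _ in range(n_workers)]
        self.misses = 0
        self.coherence_msgs = 0

    def read(self, worker, addr):
        cache = self.caches[worker]
        if addr not in cache:                    # miss: fetch from shared memory
            self.misses += 1
            cache[addr] = self.memory.get(addr)
        return cache[addr]

    def write(self, worker, addr, value):
        self.memory[addr] = value
        self.caches[worker][addr] = value
        for w, cache in enumerate(self.caches):
            if w != worker and addr in cache:    # another worker holds a copy
                self.coherence_msgs += 1
                if self.policy == "invalidate":
                    del cache[addr]              # sharer will miss on its next read
                else:
                    cache[addr] = value          # sharer receives the fresh value

# Producer-consumer sharing (worker 0 writes, worker 1 reads) favours update;
# data written repeatedly by a single worker would favour invalidation instead.
for policy in ("invalidate", "update"):
    m = ToyCoherentMemory(n_workers=2, policy=policy)
    for _ in range(1000):
        m.write(0, "x", 1)
        m.read(1, "x")
    print(policy, "misses:", m.misses, "coherence messages:", m.coherence_msgs)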

1997

Evaluating the impact of coherence protocols on parallel logic programming systems

Authors
Costa, VS; Bianchini, R; Dutra, IdC;

Publication
Fifth Euromicro Workshop on Parallel and Distributed Processing (PDP '97), January 22-24, 1997, University of Westminster, London, UK

Abstract

1996

Performance of Sparse Binding Arrays for Or-Parallelism

Authors
Costa, VS; Correia, ME; Silva, F;

Publication
Anais do VIII International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 1996)

Abstract
One important problem in the design of novel logic programming systems is the support of several forms of implicit parallelism. A new binding model, the Sparse Binding Array (SBA), has been proposed for the efficient and simplified integration of Independent-And, Determinate-And and Or-parallelism. In this paper we report on the use of this model for pure Or-parallelism. The work discusses the major implementation issues in supporting this binding model for pure Or-parallelism. We show that an implementation based on this binding model is more efficient than the original Aurora using the traditional Binding Array model [16]. Moreover, we explain how the notion of a variable level can be used to reduce overheads of the or-parallel system. Our results in supporting pure or-parallelism show that the approach is very promising for combined parallel systems.
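
For context, the sketch below illustrates the classic binding-array idea used by Aurora-style or-parallel engines, which the SBA refines (a conceptual Python sketch under that assumption, not the SBA itself; ConditionalVar and Worker are invented names): each conditional variable is assigned an offset, and every worker records its own binding for that offset in a private array, so workers exploring different branches never see each other's bindings.

class ConditionalVar:
    """A shared variable that may be bound differently on different or-branches."""
    _next_offset = 0
    def __init__(self, name):
        self.name = name
        self.offset = ConditionalVar._next_offset   # slot index in every binding array
        ConditionalVar._next_offset += 1

class Worker:
    """An or-parallel worker with its own private binding array."""
    def __init__(self, wid):
        self.wid = wid
        self.binding_array = {}                     # offset -> value (conceptually an array)

    def bind(self, var, value):
        self.binding_array[var.offset] = value

    def deref(self, var):
        return self.binding_array.get(var.offset, "unbound")

X = ConditionalVar("X")
w1, w2 = Worker(1), Worker(2)
w1.bind(X, "branch-a")           # the same variable is bound ...
w2.bind(X, "branch-b")           # ... differently on each or-branch
print(w1.deref(X), w2.deref(X))  # -> branch-a branch-b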
