
Publications by CRACS

2016

A Lock-Free Hash Trie Design for Concurrent Tabled Logic Programs

Authors
Areias, M; Rocha, R;

Publication
INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING

Abstract
Tabling is an implementation technique that improves the declarativeness and expressiveness of Prolog systems in dealing with recursion and redundant sub-computations. A critical component in the design of a concurrent tabling system is the implementation of the table space. One of the most successful proposals for representing tables is based on a two-level trie data structure, where one trie level stores the tabled subgoal calls and the other stores the computed answers. In previous work, we presented a sophisticated lock-free design where both levels of the tries were shared among threads in a concurrent environment. To implement lock-freedom we used the CAS atomic instruction, which is nowadays widely available on common architectures. CAS reduces the granularity of the synchronization when threads access concurrent areas, but still suffers from problems such as false sharing or cache memory effects. In this work, we present a simpler and more efficient lock-free design based on hash tries that minimizes these problems by dispersing the concurrent areas as much as possible. Experimental results in the Yap Prolog system show that our new lock-free design effectively reduces the execution time and scales better than previous designs.
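As a rough illustration of the CAS-based synchronization mentioned above (hypothetical names, not the actual Yap hash-trie code), the following C++ sketch links a new node at the head of a bucket chain with a compare-and-swap retry loop instead of a lock.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical bucket node for a hash-trie leaf chain; the published design
// differs, this only illustrates lock-free insertion with compare-and-swap.
struct Node {
    std::uint64_t key;
    Node* next;
};

// Insert a new key at the head of a bucket. Threads race on the bucket head;
// the CAS retries until the new node is linked, with no lock taken.
void lockfree_insert(std::atomic<Node*>& bucket_head, std::uint64_t key) {
    Node* node = new Node{key, bucket_head.load(std::memory_order_relaxed)};
    while (!bucket_head.compare_exchange_weak(
               node->next, node,
               std::memory_order_release, std::memory_order_relaxed)) {
        // On failure, node->next has been refreshed with the current head; retry.
    }
}
```

Confining each CAS to a single bucket pointer keeps contention local to threads that hash to the same bucket, which is the spirit of dispersing the concurrent areas; node removal, hash-level expansion and ABA hazards are omitted in this sketch.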

2016

Estimation-Based Search Space Traversal in PILP Environments

Authors
Real, JC; Dutra, I; Rocha, R;

Publication
Inductive Logic Programming - 26th International Conference, ILP 2016, London, UK, September 4-6, 2016, Revised Selected Papers

Abstract
Probabilistic Inductive Logic Programming (PILP) systems extend ILP by allowing the world to be represented using probabilistic facts and rules, and by learning probabilistic theories that can be used to make predictions. However, such systems can be inefficient both due to the large search space inherited from the ILP algorithm and to the probabilistic evaluation needed whenever a new candidate theory is generated. To address the latter issue, this work introduces probability estimators aimed at improving the efficiency of PILP systems. An estimator can avoid the computational cost of probabilistic theory evaluation by providing an estimate of the value of the combination of two subtheories. Experiments are performed on three real-world datasets from different areas (biology, medicine and the web) and show that, by reducing the number of theories to be evaluated, the estimators can significantly shorten the execution time without losing probabilistic accuracy.
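The abstract leaves the estimators themselves unspecified; purely as an illustration of the idea, the C++ sketch below estimates the probability of a combination of two subtheories from their individual, already-computed probabilities under an independence assumption (an assumption introduced here for the example, not taken from the paper), so that only promising combinations need the full probabilistic evaluation.

```cpp
// Hypothetical estimator: predict the probability of the disjunction of two
// subtheories from their individual probabilities, assuming independence.
// The estimate replaces the expensive evaluation of the merged theory when
// ranking candidate combinations; only the best-ranked ones are evaluated exactly.
double estimate_combined(double p1, double p2) {
    // P(T1 or T2) = P(T1) + P(T2) - P(T1)P(T2) under the independence assumption.
    return p1 + p2 - p1 * p2;
}
```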

2016

Parallel Algorithms for Multirelational Data Mining: Application to Life Science Problems

Authors
Camacho, R; Barbosa, JG; Sampaio, AM; Ladeiras, J; Fonseca, NA; Costa, VS;

Publication
Resource Management for Big Data Platforms - Algorithms, Modelling, and High-Performance Computing Techniques

Abstract

2016

Processing Markov Logic Networks with GPUs: Accelerating Network Grounding

Authors
Martinez Angeles, CA; Dutra, I; Costa, VS; Buenabad Chavez, J;

Publication
INDUCTIVE LOGIC PROGRAMMING, ILP 2015

Abstract
Markov Logic is an expressive and widely used knowledge representation formalism that combines logic and probabilities, providing a powerful framework for inference and learning tasks. Most Markov Logic implementations perform inference by transforming the logic representation into a set of weighted propositional formulae that encode a Markov network, the ground Markov network. Probabilistic inference is then performed over the grounded network. Constructing, simplifying, and evaluating the network are the main steps of the inference phase. As the size of a Markov network can grow rather quickly, Markov Logic Network (MLN) inference can become very expensive, motivating a rich vein of research on the optimization of MLN performance. We claim that parallelism can play a large role in this task. Namely, we demonstrate that widely available Graphics Processing Units (GPUs) can be used to improve the performance of a state-of-the-art MLN system, Tuffy, with minimal changes. Indeed, comparing the performance of our GPU-based system, TuGPU, to that of the Alchemy, Tuffy and RockIt systems on three widely used applications shows that TuGPU is up to 15 times faster than the other systems.
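As a toy illustration of the grounding step described above (not Tuffy's or TuGPU's actual code; the predicate and constant names are made up), the C++ sketch below enumerates the ground instances of a single weighted clause, Smokes(x) => Cancer(x), over a small domain of constants. Each ground clause is produced independently of the others, which is why grounding maps well onto GPU threads.

```cpp
#include <iostream>
#include <string>
#include <vector>

// Hypothetical ground instance of the weighted formula  Smokes(x) => Cancer(x).
struct GroundClause {
    std::string x;   // constant substituted for the variable x
    double weight;   // weight of the first-order formula
};

int main() {
    // Toy domain of constants; in a real MLN the domain comes from the evidence.
    std::vector<std::string> constants = {"anna", "bob", "carol"};
    const double w = 1.5;

    // Grounding: one ground clause per substitution of x. Each iteration is
    // independent, so the loop can be distributed over GPU threads.
    std::vector<GroundClause> network;
    for (const auto& c : constants)
        network.push_back({c, w});

    for (const auto& g : network)
        std::cout << g.weight << "  Smokes(" << g.x
                  << ") => Cancer(" << g.x << ")\n";
}
```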

2016

Predicting Wildfires: Propositional and Relational Spatio-Temporal Pre-processing Approaches

Authors
Oliveira, M; Torgo, L; Costa, VS;

Publication
DISCOVERY SCIENCE, (DS 2016)

Abstract
We present and evaluate two different methods for building spatio-temporal features: a propositional method and a method based on propositionalisation of relational clauses. Our motivating application, a regression problem, requires the prediction of the fraction of each Portuguese parish burnt yearly by wildfires, a problem with a strong socio-economic and environmental impact in the country. We evaluate and compare how these methods perform individually and combined together. We successfully use under-sampling to deal with the high skew in the data set. We find that combining the approaches significantly improves on the similar results obtained by each method individually.
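The abstract does not detail the under-sampling scheme, so the C++ sketch below only illustrates one common variant under assumed, illustrative names: keep every parish-year with a non-zero burnt fraction and only a random subset of the dominant zero-burn cases, reducing the skew of the regression target.

```cpp
#include <random>
#include <vector>

// Hypothetical training case: the yearly burnt fraction of one parish plus its
// spatio-temporal features (contents omitted; names are illustrative only).
struct Case {
    double burnt_fraction;          // regression target, zero for most parish-years
    std::vector<double> features;   // propositional or propositionalised features
};

// Random under-sampling of the dominant zero-target cases: keep every case with
// a non-zero target, and each zero-target case with probability keep_ratio.
std::vector<Case> undersample(const std::vector<Case>& data,
                              double keep_ratio, unsigned seed = 42) {
    std::mt19937 gen(seed);
    std::bernoulli_distribution keep(keep_ratio);
    std::vector<Case> out;
    for (const auto& c : data)
        if (c.burnt_fraction > 0.0 || keep(gen))
            out.push_back(c);
    return out;
}
```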

2016

Relational Learning with GPUs: Accelerating Rule Coverage

Authors
Martinez Angeles, CA; Wu, HC; Dutra, I; Costa, VS; Buenabad Chavez, J;

Publication
INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING

Abstract
Relational learning algorithms mine complex databases for interesting patterns. Usually, the search space of patterns grows very quickly with the increase in data size, making it impractical to solve important problems. In this work we present the design of a relational learning system that takes advantage of graphics processing units (GPUs) to perform the most time-consuming function of the learner, rule coverage. To evaluate performance, we use four applications: a widely used relational learning benchmark for predicting carcinogenesis in rodents, an application in chemo-informatics, an application in opinion mining, and an application in mining health record data. We compare results using a single CPU, multiple CPUs in a multicore host, and the GPU version. Results show that the GPU version of the learner is up to eight times faster than the best CPU version.
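Rule coverage, as used above, amounts to counting how many examples a candidate rule succeeds on. A minimal C++ sketch with a deliberately simplified, hypothetical rule representation (not the system's actual one) shows the core of the computation: every example is tested for every candidate rule, and each test is independent, so the loop can be split across CPU cores or GPU threads.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical example encoding: a fixed-width vector of attribute values.
using Example = std::vector<int>;
// A rule is abstracted here as a predicate over a single example.
using Rule = std::function<bool(const Example&)>;

// Coverage = number of examples the candidate rule succeeds on. Each test is
// independent of the others, which makes the loop easy to parallelise.
std::size_t coverage(const Rule& rule, const std::vector<Example>& examples) {
    std::size_t covered = 0;
    for (const auto& e : examples)
        if (rule(e)) ++covered;
    return covered;
}
```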
