Publications

Publications by CRACS

2019

Memory Reclamation Methods for Lock-Free Hash Tries

Authors
Moreno, P; Areias, M; Rocha, R;

Publication
2019 31ST INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD 2019)

Abstract
Hash tries are a trie-based data structure with nearly ideal characteristics for the implementation of hash maps. Starting from a particular lock-free hash map data structure, named Lock-Free Hash Tries (LFHT), we focus on solving the problem of memory reclamation without losing the lock-freedom property. We propose an approach that exploits the characteristics of the LFHT structure in order to achieve efficient memory reclamation with low and well-defined memory bounds. Experimental results show that our approach obtains better results than other state-of-the-art memory reclamation methods and provides a competitive and scalable hash map implementation when compared to lock-based implementations.

2019

Evaluation Procedures for Forecasting with Spatio-Temporal Data

Authors
Oliveira, M; Torgo, L; Costa, VS;

Publication
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2018, PT I

Abstract
The amount of available spatio-temporal data has been increasing as large-scale data collection (e.g., from geosensor networks) becomes more prevalent. This has led to an increase in spatio-temporal forecasting applications using geo-referenced time series data motivated by important domains such as environmental monitoring (e.g., air pollution index, forest fire risk prediction). Being able to properly assess the performance of new forecasting approaches is fundamental to achieve progress. However, the dependence between observations that the spatio-temporal context implies, besides being challenging in the modelling step, also raises issues for performance estimation as indicated by previous work. In this paper, we empirically compare several variants of cross-validation (CV) and out-of-sample (OOS) performance estimation procedures that respect data ordering, using both artificially generated and real-world spatio-temporal data sets. Our results show both CV and OOS reporting useful estimates. Further, they suggest that blocking may be useful in addressing CV's bias to underestimate error. OOS can be very sensitive to test size, as expected, but estimates can be improved by careful management of the temporal dimension in training.
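
As an illustration only, the two families of procedures named in the abstract can be sketched along the temporal dimension. This is a minimal sketch, not the paper's exact procedures: the function names `oos_holdout` and `blocked_time_folds`, the `test_fraction` and `n_folds` parameters, and the choice to block only by time (ignoring space) are all assumptions made for the example.

```python
def oos_holdout(time_ids, test_fraction=0.2):
    """Out-of-sample split: every test time step comes after every training step.

    Hypothetical helper; `test_fraction` is an assumed parameter.
    """
    steps = sorted(set(time_ids))
    n_test = max(1, int(round(len(steps) * test_fraction)))
    cut = len(steps) - n_test
    return steps[:cut], steps[cut:]


def blocked_time_folds(time_ids, n_folds=5):
    """Blocked CV: partition the distinct time steps into contiguous blocks.

    Each block serves once as the test fold, so neighbouring (highly
    dependent) time steps are kept together instead of being shuffled.
    """
    steps = sorted(set(time_ids))
    base, extra = divmod(len(steps), n_folds)
    folds, start = [], 0
    for i in range(n_folds):
        size = base + (1 if i < extra else 0)  # spread the remainder
        folds.append(steps[start:start + size])
        start += size
    return folds


train_steps, test_steps = oos_holdout(range(10), test_fraction=0.2)
folds = blocked_time_folds(range(10), n_folds=3)
```

Keeping contiguous time steps in the same fold is one common way to limit the leakage between training and test data that temporal dependence would otherwise cause under random shuffling.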

2019

Contrasting logical sequences in multi-relational learning

Authors
Ferreira, CA; Gama, J; Costa, VS;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE

Abstract
In this paper, we present the BeamSouL sequence miner that finds sequences of logical atoms. This algorithm uses a levelwise hybrid search strategy to find a subset of contrasting logical sequences available in a SeqLog database. The hybrid search strategy runs an exhaustive search in the first phase, followed by a beam search strategy. In the beam search phase, the algorithm uses the confidence metric to select the top k sequential patterns that will be specialized in the next level. Moreover, we develop a first-order logic classification framework that uses a predicate invention technique to include the BeamSouL findings in the learning process. We evaluate the performance of our proposals using four multi-relational databases. The results are promising, and the BeamSouL algorithm can be more than one order of magnitude faster than the baseline and can find long and highly discriminative contrasting sequences.
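
The beam step described in the abstract (keep the top k patterns by confidence, then specialize only those at the next level) can be sketched generically. This is a minimal sketch and not BeamSouL itself: `confidence` and `select_beam` are hypothetical names, patterns are represented as plain tuples of atom names, and confidence is assumed to be the fraction of matching sequences that belong to the target class.

```python
def confidence(support_in_class, support_total):
    """Assumed definition: share of matching sequences that are in the target class."""
    if support_total == 0:
        return 0.0
    return support_in_class / support_total


def select_beam(candidates, k):
    """Keep the k highest-confidence patterns for the next level.

    candidates: dict mapping a pattern (tuple of atom names) to
    (support in target class, total support). Ties are broken by
    pattern order so the result is deterministic.
    """
    ranked = sorted(
        candidates,
        key=lambda p: (-confidence(*candidates[p]), p),
    )
    return ranked[:k]


level = {
    ("a", "b"): (8, 10),
    ("a", "c"): (3, 10),
    ("b", "c"): (9, 10),
}
beam = select_beam(level, k=2)
```

Only the patterns in `beam` would be extended at the next level, which is what bounds the search after the initial exhaustive phase.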

2019

Biased Resampling Strategies for Imbalanced Spatio-Temporal Forecasting

Authors
Oliveira, M; Moniz, N; Torgo, L; Costa, VS;

Publication
2019 IEEE INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA 2019)

Abstract
Extreme and rare events, such as abnormal spikes in air pollution or weather conditions, can have serious repercussions. Many of these events develop from spatio-temporal processes, and accurate predictions are a most valuable tool in addressing their impact in a timely manner. In this paper, we propose a new set of resampling strategies for imbalanced spatio-temporal forecasting tasks, by introducing bias into formerly random processes. This spatio-temporal bias includes a hyperparameter that regulates the relative importance of the temporal and spatial dimensions in the selection of observations during under- or over-sampling. We test and compare our proposals against standard versions of the strategies on 10 different georeferenced numeric time series, using 3 distinct off-the-shelf learning algorithms. Experimental results show that our proposal provides an advantage over random resampling strategies in imbalanced spatio-temporal forecasting tasks. We also find that valuing an observation's recency is more useful when over-sampling, while valuing its spatial distance to other cases with extreme values is more beneficial when under-sampling.
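
To make the role of such a hyperparameter concrete, here is a minimal sketch of one way a single weight could trade off the temporal and spatial dimensions when picking observations to replicate. It is not the paper's formulation: the name `alpha`, the linear mix, the min-max scaling, and the helpers `min_max`, `biased_weights` and `biased_oversample` are all assumptions made for illustration.

```python
import random


def min_max(values):
    """Scale values to [0, 1]; constant input maps to all ones."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def biased_weights(timestamps, dist_to_extremes, alpha):
    """Assumed linear mix: alpha weighs recency, (1 - alpha) weighs spatial closeness."""
    recency = min_max(timestamps)                        # newest observation -> 1
    closeness = min_max([-d for d in dist_to_extremes])  # nearest observation -> 1
    return [alpha * r + (1 - alpha) * c for r, c in zip(recency, closeness)]


def biased_oversample(indices, weights, n_extra, seed=0):
    """Draw n_extra indices to replicate, with probability proportional to weight."""
    rng = random.Random(seed)
    return rng.choices(indices, weights=weights, k=n_extra)


timestamps = [1, 2, 3]
dist_to_extremes = [0.0, 5.0, 10.0]
w_time = biased_weights(timestamps, dist_to_extremes, alpha=1.0)
w_space = biased_weights(timestamps, dist_to_extremes, alpha=0.0)
extra = biased_oversample([0, 1, 2], w_time, n_extra=4)
```

With `alpha=1.0` only recency matters and with `alpha=0.0` only spatial closeness matters; intermediate values blend the two.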

2019

A Three-Valued Semantics for Typed Logic Programming

Authors
Barbosa, J; Florido, M; Costa, VS;

Publication
Proceedings 35th International Conference on Logic Programming (Technical Communications), ICLP 2019 Technical Communications, Las Cruces, NM, USA, September 20-25, 2019.

Abstract
Types in logic programming have focused on conservative approximations of program semantics by regular types, on one hand, and on type systems based on a prescriptive semantics defined for typed programs, on the other. In this paper, we define a new semantics for logic programming, where programs evaluate to true, false, and to a new semantic value called wrong, corresponding to a run-time type error. We then have a type language with a separate semantics of types. Finally, we define a type system for logic programming and prove that it is semantically sound with respect to a semantic relation between programs and types where, if a program has a type, then its semantics is not wrong. Our work follows Milner’s approach for typed functional languages where the semantics of programs is independent of the semantics of types, and the type system is proved to be sound with respect to a relation between both semantics.

2019

Machine Learning to Predict Developmental Neurotoxicity with High-Throughput Data from 2D Bio-Engineered Tissues

Authors
Kuusisto, F; Costa, VS; Hou, Z; Thomson, JA; Page, D; Stewart, RM;

Publication
18th IEEE International Conference On Machine Learning And Applications, ICMLA 2019, Boca Raton, FL, USA, December 16-19, 2019

Abstract
There is a growing need for fast and accurate methods for testing developmental neurotoxicity across several chemical exposure sources. Current approaches, such as in vivo animal studies, and assays of animal and human primary cell cultures, suffer from challenges related to time, cost, and applicability to human physiology. Prior work has demonstrated success employing machine learning to predict developmental neurotoxicity using gene expression data collected from human 3D tissue models exposed to various compounds. The 3D model is biologically similar to developing neural structures, but its complexity necessitates extensive expertise and effort to employ. By instead focusing solely on constructing an assay of developmental neurotoxicity, we propose that a simpler 2D tissue model may prove sufficient. We thus compare the accuracy of predictive models trained on data from a 2D tissue model with those trained on data from a 3D tissue model, and find the 2D model to be substantially more accurate. Furthermore, we find the 2D model to be more robust under stringent gene set selection, whereas the 3D model suffers substantial accuracy degradation. While both approaches have advantages and disadvantages, we propose that our described 2D approach could be a valuable tool for decision makers when prioritizing neurotoxicity screening. © 2019 IEEE.
