2014
Authors
Bellodi, E; Lamma, E; Riguzzi, F; Costa, VS; Zese, R;
Publication
THEORY AND PRACTICE OF LOGIC PROGRAMMING
Abstract
Lifted inference has been proposed for various probabilistic logical frameworks in order to compute the probability of queries in a time that depends on the size of the domains of the random variables rather than the number of instances. Even if various authors have underlined its importance for probabilistic logic programming (PLP), lifted inference has been applied up to now only to relational languages outside of logic programming. In this paper we adapt Generalized Counting First Order Variable Elimination (GC-FOVE) to the problem of computing the probability of queries to probabilistic logic programs under the distribution semantics. In particular, we extend the Prolog Factor Language (PFL) to include two new types of factors that are needed for representing ProbLog programs. These factors take into account the existing causal independence relationships among random variables and are managed by the extension to variable elimination proposed by Zhang and Poole for dealing with convergent variables and heterogeneous factors. Two new operators are added to GC-FOVE for treating heterogeneous factors. The resulting algorithm, called LP2 for Lifted Probabilistic Logic Programming, has been implemented by modifying the PFL implementation of GC-FOVE and tested on three benchmarks for lifted inference. A comparison with PITA and ProbLog2 shows the potential of the approach.
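A small numeric illustration of why causal independence makes lifting possible may help here. This is only a sketch under stated assumptions, not the LP2 algorithm or the PFL API: it assumes a query atom supported by `n` interchangeable ground causes, each firing independently with the same probability `p`, combined by noisy-OR. The function names are hypothetical.

```python
def grounded_noisy_or(probs):
    """Noisy-OR over explicitly enumerated ground causes (one step per instance)."""
    none_fires = 1.0
    for p in probs:
        none_fires *= (1.0 - p)
    return 1.0 - none_fires


def lifted_noisy_or(p, n):
    """Same quantity for n interchangeable causes, using only the count n.

    Assumption: all n causes share the same probability p and are independent.
    """
    return 1.0 - (1.0 - p) ** n


if __name__ == "__main__":
    # Both calls compute the same probability; only the second avoids
    # enumerating the individual instances.
    print(grounded_noisy_or([0.3] * 5))
    print(lifted_noisy_or(0.3, 5))
```

The grounded version touches every instance, while the lifted version needs only the number of interchangeable instances, which is the intuition behind handling groups of random variables as a whole.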
2017
Authors
Paes, A; Zaverucha, G; Costa, VS;
Publication
MACHINE LEARNING
Abstract
Theory Revision from Examples is the process of repairing incorrect theories and/or improving incomplete theories from a set of examples. This process usually results in more accurate and comprehensible theories than purely inductive learning. However, so far, progress on the use of theory revision techniques has been limited by the large search space they yield. In this article, we argue that it is possible to reduce the search space of a theory revision system by introducing stochastic local search. More precisely, we introduce a number of stochastic local search components at the key steps of the revision process, and implement them on a state-of-the-art revision system that makes use of the most specific clause to constrain the search space. We show that with the use of these SLS techniques it is possible for the revision system to be executed in a feasible time, while still improving the initial theory and in a number of cases even reaching better accuracies than the deterministic revision process. Moreover, in some cases the revision process can be faster and still achieve better accuracies than an ILP system learning from an empty initial hypothesis or assuming an initial theory to be correct.
2015
Authors
Davis, J; Costa, VS; Peissig, PL; Caldwell, M; Page, D;
Publication
Foundations of Biomedical Knowledge Representation - Methods and Applications
Abstract
2019
Autores
Oliveira, M; Torgo, L; Costa, VS;
Publicação
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2018, PT I
Abstract
The amount of available spatio-temporal data has been increasing as large-scale data collection (e.g., from geosensor networks) becomes more prevalent. This has led to an increase in spatio-temporal forecasting applications using geo-referenced time series data motivated by important domains such as environmental monitoring (e.g., air pollution index, forest fire risk prediction). Being able to properly assess the performance of new forecasting approaches is fundamental to achieving progress. However, the dependence between observations that the spatio-temporal context implies, besides being challenging in the modelling step, also raises issues for performance estimation, as indicated by previous work. In this paper, we empirically compare several variants of cross-validation (CV) and out-of-sample (OOS) performance estimation procedures that respect data ordering, using both artificially generated and real-world spatio-temporal data sets. Our results show that both CV and OOS report useful estimates. Further, they suggest that blocking may be useful in addressing CV's bias to underestimate error. OOS can be very sensitive to test size, as expected, but estimates can be improved by careful management of the temporal dimension in training.
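To make the two families of procedures concrete, the following is a minimal sketch of index splitting for time-ordered observations. It assumes a purely temporal ordering of `n` observations (the spatial dimension is ignored), and the function names `oos_holdout` and `blocked_cv` are hypothetical; it does not reproduce the specific variants compared in the paper.

```python
def oos_holdout(n, test_fraction=0.2):
    """Out-of-sample split: train on the earliest indices, test on the latest."""
    n_test = max(1, int(round(n * test_fraction)))
    split = n - n_test
    return list(range(split)), list(range(split, n))


def blocked_cv(n, k=5):
    """Blocked k-fold CV: each test fold is one contiguous block in time.

    Observations are never shuffled, so neighbouring (dependent)
    observations stay together in the same fold.
    """
    bounds = [(i * n) // k for i in range(k + 1)]
    folds = []
    for i in range(k):
        test = list(range(bounds[i], bounds[i + 1]))
        train = list(range(0, bounds[i])) + list(range(bounds[i + 1], n))
        folds.append((train, test))
    return folds


if __name__ == "__main__":
    print(oos_holdout(10, test_fraction=0.2))
    for train, test in blocked_cv(10, k=5):
        print(train, test)
```

In this sketch the OOS estimate depends on a single test window whose size is set by `test_fraction`, whereas blocked CV averages over `k` contiguous windows.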
2019
Autores
Ferreira, CA; Gama, J; Costa, VS;
Publicação
PROGRESS IN ARTIFICIAL INTELLIGENCE
Abstract
In this paper, we present the BeamSouL sequence miner, which finds sequences of logical atoms. This algorithm uses a levelwise hybrid search strategy to find a subset of the contrasting logical sequences available in a SeqLog database. The hybrid strategy runs an exhaustive search in the first phase, followed by a beam search. In the beam search phase, the algorithm uses the confidence metric to select the top k sequential patterns that will be specialized in the next level. Moreover, we develop a first-order logic classification framework that uses a predicate invention technique to include the BeamSouL findings in the learning process. We evaluate the performance of our proposals using four multi-relational databases. The results are promising: the BeamSouL algorithm can be more than one order of magnitude faster than the baseline and can find long and highly discriminative contrasting sequences.
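The beam search phase can be sketched as follows. This is an illustrative simplification, not the BeamSouL implementation: it assumes sequences of plain symbols rather than logical atoms, two classes of examples, and confidence defined as the fraction of covered sequences that are positive. All function names are hypothetical.

```python
def contains(sequence, pattern):
    """True if pattern occurs in sequence as a (not necessarily contiguous) subsequence."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def confidence(pattern, positives, negatives):
    """Share of the sequences covered by the pattern that are positive examples."""
    pos = sum(contains(s, pattern) for s in positives)
    neg = sum(contains(s, pattern) for s in negatives)
    return pos / (pos + neg) if pos + neg else 0.0


def beam_step(beam, alphabet, positives, negatives, k):
    """Specialize every pattern in the beam by one symbol and keep the top k.

    Ties in confidence are broken by the pattern itself so the result is deterministic.
    """
    candidates = {p + (a,) for p in beam for a in alphabet}
    ranked = sorted(
        candidates,
        key=lambda p: (-confidence(p, positives, negatives), p),
    )
    return ranked[:k]


if __name__ == "__main__":
    positives = [("a", "b", "c"), ("a", "c")]
    negatives = [("b", "a"), ("c", "b")]
    print(beam_step([("a",)], "abc", positives, negatives, k=2))
```

Restricting each level to `k` patterns bounds the number of candidates generated at the next level, which is where a speed-up over exhaustive levelwise search would come from.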
2019
Autores
Oliveira, M; Moniz, N; Torgo, L; Costa, VS;
Publicação
2019 IEEE INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA 2019)
Abstract
Extreme and rare events, such as abnormal spikes in air pollution or weather conditions, can have serious repercussions. Many such events develop from spatio-temporal processes, and accurate predictions are a valuable tool for addressing their impact in a timely manner. In this paper, we propose a new set of resampling strategies for imbalanced spatio-temporal forecasting tasks that introduce bias into formerly random processes. This spatio-temporal bias includes a hyperparameter that regulates the relative importance of the temporal and spatial dimensions in the selection of observations during under- or over-sampling. We test and compare our proposals against standard versions of the strategies on 10 different georeferenced numeric time series, using 3 distinct off-the-shelf learning algorithms. Experimental results show that our proposal provides an advantage over random resampling strategies in imbalanced spatio-temporal forecasting tasks. We also find that valuing an observation's recency is more useful when over-sampling, while valuing its spatial distance to other cases with extreme values is more beneficial when under-sampling.