Publications

Publications by AI

2020

Mining Human Mobility Data to Discover Locations and Habits

Authors
Andrade, T; Cancela, B; Gama, J;

Publication
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2019, PT II

Abstract
Many aspects of life are tied to the places that make up human mobility patterns, and the mobile devices that individuals carry are increasingly pervasive. The positioning technologies that serve these devices, such as cellular antennas (GSM networks), global navigation satellite systems (GPS), and more recently the WiFi positioning system (WPS), continuously provide large amounts of spatio-temporal data. Detecting significant places and the frequency of movements between them is therefore fundamental to understanding human behavior. In this paper, we propose a method for discovering user habits without any a priori or external knowledge: a density-based clustering for spatio-temporal data identifies meaningful places, and a Gaussian Mixture Model (GMM) applied over the set of meaningful places identifies the representations of individual habits. To evaluate the proposed method, we use two real-world datasets: one contains high-density GPS data and the other contains GSM mobile phone data in a coarse representation. The results show that the proposed method is suitable for this task, as many unique habits were identified. These habits can be used to understand users' behavior and to draw characterizing profiles that give a panorama of the mobility patterns in the data.

2019

Heart Sounds Classification Using Images from Wavelet Transformation

Authors
Nogueira, DM; Zarmehri, MN; Ferreira, CA; Jorge, AM; Antunes, L;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2019, PT I

Abstract
Cardiovascular disease is the leading cause of death around the world, and its early detection is key to improving long-term health outcomes. To detect possible heart anomalies at an early stage, an automatic method enabling low-cost cardiac health screening for the general population would be highly valuable. By analyzing phonocardiogram (PCG) signals, it is possible to perform cardiac diagnosis and find possible anomalies at an early stage. Accordingly, the development of intelligent and automated tools for PCG analysis is very relevant. In this work, PCG signals are studied with the main objective of determining whether a PCG signal corresponds to a “normal” or “abnormal” physiological state. The main contribution of this work is evidence that time-domain features can be combined with features extracted from a wavelet transformation of PCG signals to improve automatic cardiac disease classification. We empirically demonstrate that, from a pool of alternatives, the best classification results are achieved when both time and wavelet features are used by a Support Vector Machine with a linear kernel. Our approach obtained better results than those reported by the challenge participants, which use large amounts of data and high computational power. © Springer Nature Switzerland AG 2019.

2019

Evaluation Procedures for Forecasting with Spatio-Temporal Data

Authors
Oliveira, M; Torgo, L; Costa, VS;

Publication
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2018, PT I

Abstract
The amount of available spatio-temporal data has been increasing as large-scale data collection (e.g., from geosensor networks) becomes more prevalent. This has led to an increase in spatio-temporal forecasting applications using geo-referenced time series data, motivated by important domains such as environmental monitoring (e.g., air pollution index, forest fire risk prediction). Being able to properly assess the performance of new forecasting approaches is fundamental to achieving progress. However, the dependence between observations that the spatio-temporal context implies, besides being challenging in the modelling step, also raises issues for performance estimation, as indicated by previous work. In this paper, we empirically compare several variants of cross-validation (CV) and out-of-sample (OOS) performance estimation procedures that respect data ordering, using both artificially generated and real-world spatio-temporal data sets. Our results show that both CV and OOS report useful estimates. Further, they suggest that blocking may be useful in addressing CV's bias towards underestimating error. OOS can be very sensitive to test size, as expected, but its estimates can be improved by careful management of the temporal dimension in training.
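
As a rough illustration of the two families of estimation procedures named in the abstract, the sketch below builds temporally blocked CV folds and a single out-of-sample split over time indices. It is not the authors' procedure: the function names `temporal_block_folds` and `out_of_sample_split`, the use of plain integer time indices, and the choice of contiguous equal-sized blocks are all assumptions made for illustration.

```python
def temporal_block_folds(n_times, k):
    """Split time indices 0..n_times-1 into k contiguous blocks.

    Each block is used once as the test set; all other time indices
    form the training set. (Hypothetical helper, illustration only.)
    """
    # Block boundaries computed with integer arithmetic.
    bounds = [i * n_times // k for i in range(k + 1)]
    folds = []
    for i in range(k):
        lo, hi = bounds[i], bounds[i + 1]
        test = list(range(lo, hi))
        train = [t for t in range(n_times) if t < lo or t >= hi]
        folds.append((train, test))
    return folds


def out_of_sample_split(n_times, test_size):
    """Hold out the last test_size time indices; train on everything before."""
    cut = n_times - test_size
    return list(range(cut)), list(range(cut, n_times))


if __name__ == "__main__":
    for train, test in temporal_block_folds(10, 5):
        print("CV  train:", train, "test:", test)
    print("OOS", out_of_sample_split(10, 2))
```

Keeping each test block contiguous in time is one simple way to keep neighbouring, dependent observations out of both the training and test sets at once. The out-of-sample split trains only on the past, which is why its estimate depends on the chosen `test_size`.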

2019

Contrasting logical sequences in multi-relational learning

Authors
Ferreira, CA; Gama, J; Costa, VS;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE

Abstract
In this paper, we present BeamSouL, a sequence miner that finds sequences of logical atoms. The algorithm uses a levelwise hybrid search strategy to find a subset of the contrasting logical sequences available in a SeqLog database. The hybrid strategy runs an exhaustive search in the first phase, followed by a beam search. In the beam search phase, the algorithm uses the confidence metric to select the top k sequential patterns that will be specialized in the next level. Moreover, we develop a first-order logic classification framework that uses a predicate invention technique to include the BeamSouL findings in the learning process. We evaluate the performance of our proposals using four multi-relational databases. The results are promising: the BeamSouL algorithm can be more than one order of magnitude faster than the baseline and can find long and highly discriminative contrasting sequences.
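
The beam selection step described in the abstract (keep the top k patterns by confidence, then specialize only those) can be sketched as follows. This is not the BeamSouL implementation: the `confidence` definition used here (share of a pattern's matches that fall in the target class), the `select_beam` helper, and the toy counts are assumptions for illustration only.

```python
import heapq


def confidence(pos_count, neg_count):
    """Assumed definition: fraction of matches that belong to the target class."""
    total = pos_count + neg_count
    return pos_count / total if total else 0.0


def select_beam(candidates, k):
    """Return the k patterns with the highest confidence, best first.

    candidates maps a pattern (a tuple of atoms) to a
    (pos_count, neg_count) pair. Hypothetical data layout.
    """
    return heapq.nlargest(
        k, candidates, key=lambda pattern: confidence(*candidates[pattern])
    )


if __name__ == "__main__":
    # Toy counts, invented for this example.
    level = {
        ("a", "b"): (8, 2),
        ("a", "c"): (5, 5),
        ("b", "c"): (9, 0),
    }
    print(select_beam(level, 2))
```

Only the patterns returned by `select_beam` would be extended at the next level, which is what bounds the search cost compared with an exhaustive levelwise search.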

2019

Discovering Common Pathways Across Users' Habits in Mobility Data

Authors
Andrade, T; Cancela, B; Gama, J;

Publication
Progress in Artificial Intelligence - 19th EPIA Conference on Artificial Intelligence, EPIA 2019, Vila Real, Portugal, September 3-6, 2019, Proceedings, Part II

Abstract
People perform different activities during the day, and many aspects of life are tied to the places that make up human mobility patterns. Some of these activities are recurrent and require the individual to travel between regular places, such as going to work, going to school, or going back home from wherever the individual is located. To accomplish these recurrent daily activities, people tend to follow regular paths with similar temporal and spatial characteristics. In this paper, we propose a method for discovering common pathways across users’ habits. Using density-based clustering algorithms, we detect the users’ preferred locations and apply a Gaussian Mixture Model (GMM) over these locations to automatically separate the trajectories that follow patterns of days and hours, in order to discover the representations of individuals’ habits. Over the set of users’ habits, we search for the trajectories that are most common among them using the Longest Common Sub-sequence (LCSS) algorithm, considering the distance that pairs of users travel on the same path. To evaluate the proposed method, we use a real-world GPS dataset. The results show that the method is able to find common routes between users that have similar habits, paving the way for future recommendation, prediction and carpooling research. © 2019, Springer Nature Switzerland AG.
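
For readers unfamiliar with the longest common subsequence idea mentioned in the abstract, a minimal sketch follows. It uses the textbook dynamic program over sequences of discrete place labels, which is a simplification: the abstract describes an LCSS that also considers the distance users travel on the same path, and that part is not reproduced here. The function name `lcs_length` and the example routes are hypothetical.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b.

    Textbook dynamic program using O(len(b)) memory.
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Extend the best subsequence ending before x and y.
                curr.append(prev[j - 1] + 1)
            else:
                # Skip either x or y, whichever keeps the longer match.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # Invented place sequences for two users.
    route_a = ["home", "cafe", "bridge", "office"]
    route_b = ["home", "bridge", "gym", "office"]
    print(lcs_length(route_a, route_b))  # 3: home, bridge, office
```

A higher score means two routes share more places in the same order, even if each user makes extra stops in between.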

2019

Digital Ampelographer: A CNN Based Preliminary Approach

Authors
Adão, T; Pinho, TM; Ferreira, A; Sousa, AMR; Pádua, L; Sousa, J; Sousa, JJ; Peres, E; Morais, R;

Publication
Progress in Artificial Intelligence - 19th EPIA Conference on Artificial Intelligence, EPIA 2019, Vila Real, Portugal, September 3-6, 2019, Proceedings, Part I

Abstract
Authenticity, traceability and certification are key to assuring both quality and confidence for wine consumers and added commercial value for farmers and winemakers. Grapevine variety stands out as one of the most relevant factors to be considered in wine identification within the whole wine sector value chain. Ampelography is the science responsible for identifying grapevine varieties based on (i) in-situ visual inspection of mature grapevine leaves and (ii) the ampelographer's experience. Laboratory analysis is a costly and time-consuming alternative. Both the lack of experienced professionals and context-induced error can severely hinder the role of official regulatory authorities and therefore have a significant impact on the value chain. The purpose of this paper is to assess the potential of deep learning to classify grapevine varieties through the ampelometric analysis of leaves. The performance of three convolutional neural network architectures is evaluated using a dataset composed of leaves from six different grapevine varieties. This preliminary approach identified the Xception architecture as very promising for classifying grapevine varieties and therefore for supporting a future autonomous tool that assists wine sector stakeholders, particularly the official regulatory authorities. © Springer Nature Switzerland AG 2019.
