Publications

Publications by João Gama

2016

Online Multi-label Classification with Adaptive Model Rules

Authors
Sousa, R; Gama, J;

Publication
ADVANCES IN ARTIFICIAL INTELLIGENCE, CAEPIA 2016

Abstract
Interest in online classification has been increasing due to the growth of data stream systems, and the need for Multi-label Classification applications has followed the same trend. However, most classification methods are not performed online. Moreover, data streams produce huge amounts of data and the available processing resources may not be sufficient. This work-in-progress paper proposes an algorithm for Multi-label Classification applications in data stream scenarios. The proposed method is derived from the multi-target structured regressor AMRules, which produces models using subsets of output attributes (output specialization strategy). Performance tests were conducted in which the global, local and subset operation modes of the proposed method were compared to each other and to other online multi-label classifiers described in the literature. Three datasets from real scenarios were used for evaluation. The results indicate that the subset specialization mode is competitive in comparison to the local and global approaches and to other online multi-label classifiers.

2013

Real-time algorithm for changes detection in depth of anesthesia signals

Authors
Sebastiao, R; Silva, MM; Rabico, R; Gama, J; Mendonca, T;

Publication
Evolving Systems

Abstract
This paper presents a real-time algorithm for change detection in depth of anesthesia signals. A Page-Hinkley test (PHT) with a forgetting mechanism (PHT-FM) was developed. The samples are weighted according to their "age" so that more importance is given to recent samples. This enables the detection of changes with less time delay than if no forgetting factor were used. The performance of the PHT-FM was evaluated in a two-fold approach. First, the algorithm was run offline on depth of anesthesia (DoA) signals previously collected during general anesthesia, allowing the adjustment of the forgetting mechanism. Second, the PHT-FM was embedded in real-time software and its performance was validated online in the surgery room. This was performed by asking the clinician to classify the changes in real time as true positives, false positives or false negatives. The results show that 69% of the changes were classified as true positives, 26% as false positives, and 5% as false negatives. The true positives were also synchronized with changes in the hypnotic or analgesic rates made by the clinician. The contribution of this work has a high impact on clinical practice, since the PHT-FM alerts the clinician to changes in the anesthetic state of the patient, allowing more prompt action. The results encourage the inclusion of the proposed PHT-FM in a real-time decision support system for routine use in clinical practice. © 2012 Springer-Verlag.
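The idea of a Page-Hinkley test with forgetting can be illustrated with a minimal sketch. The abstract does not give the exact weighting scheme of the PHT-FM, so the version below assumes one common variant: a fading factor `alpha` applied to the cumulative sum, so that older deviations count less. The class name, the parameter names (`delta`, `threshold`, `alpha`) and their default values are illustrative assumptions, not taken from the paper, and the test shown is one-sided (it only flags increases in the signal mean).

```python
class PageHinkleyFading:
    """One-sided Page-Hinkley test with a fading factor (illustrative sketch).

    Assumption: forgetting is implemented by multiplying the running
    cumulative sum by alpha (0 < alpha <= 1) before adding each new deviation.
    """

    def __init__(self, delta=0.005, threshold=5.0, alpha=0.99):
        self.delta = delta          # tolerated magnitude of change
        self.threshold = threshold  # alarm threshold (lambda)
        self.alpha = alpha          # fading factor; 1.0 means no forgetting
        self.reset()

    def reset(self):
        self.n = 0          # samples seen since the last reset
        self.mean = 0.0     # running mean of the signal
        self.cum = 0.0      # faded cumulative deviation m_t
        self.min_cum = 0.0  # minimum of m_t seen so far

    def update(self, x):
        """Feed one sample; return True if a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum = self.alpha * self.cum + (x - self.mean - self.delta)
        self.min_cum = min(self.min_cum, self.cum)
        if self.cum - self.min_cum > self.threshold:
            self.reset()  # restart monitoring after an alarm
            return True
        return False


def detect_changes(signal, **kwargs):
    """Return the indices at which the detector raises an alarm."""
    detector = PageHinkleyFading(**kwargs)
    return [i for i, x in enumerate(signal) if detector.update(x)]
```

For example, `detect_changes([0.0] * 100 + [10.0] * 100)` flags the step at index 100. Detecting decreases as well would need a second, mirrored detector.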

2014

Using probabilistic graphical models to enhance the prognosis of health-related quality of life in adult survivors of critical illness

Authors
Dias, CC; Granja, C; Costa Pereira, A; Gama, J; Rodrigues, PP;

Publication
2014 IEEE 27TH INTERNATIONAL SYMPOSIUM ON COMPUTER-BASED MEDICAL SYSTEMS (CBMS)

Abstract
Health-related quality of life (HR-QoL) is a subjective concept, reflecting the overall mental and physical state of the patient, and their own sense of well-being. Estimating current and future QoL has become a major outcome in the evaluation of critically ill patients. The aim of this study is to enhance the inference process of the 6-week and 6-month prognosis of QoL after an intensive care unit (ICU) stay, using the EQ-5D questionnaire. The main outcomes of the study were the five main EQ-5D dimensions: mobility, self-care, usual activities, pain and anxiety/depression. For each outcome, three Bayesian classifiers were built and validated with 10-fold cross-validation. Sixty and 473 patients (6 weeks and 6 months, respectively) were included. Overall, QoL at 6 months is higher than at 6 weeks, with the probability of absence of problems ranging from 31% (6 weeks, mobility) to 72% (6 months, self-care). Bayesian models achieved prognosis accuracies from 56% (6 months, anxiety/depression) up to 80% (6 weeks, mobility). The prognosis inference process for an individual patient was enhanced with the visual analysis of the models, showing that women, the elderly, or people with a longer ICU stay have a higher risk of QoL problems at 6 weeks. Likewise, for the 6-month prognosis, a higher APACHE II severity score also leads to a higher risk of problems, except for anxiety/depression, where the youngest and active have increased risk. Bayesian networks are competitive with less descriptive strategies, improve the inference process by incorporating domain knowledge and present a more interpretable model. The relationships among different factors extracted by the Bayesian models are in accordance with those reported in previous state-of-the-art literature, hence showing their usability as an inference model.

2015

Visualization for streaming telecommunications networks

Authors
Sarmento, R; Cordeiro, M; Gama, J;

Publication
Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science)

Abstract
Regular services in telecommunications produce massive volumes of relational data. In this work the data produced in telecommunications is seen as a streaming network, where clients are the nodes and phone calls are the edges. Visualization techniques are required for exploratory data analysis and event detection. In social network visualization and analysis the goal is to get more information from the data taking into account actors at the individual level. Previous methods relied on aggregating communities, k-Core decompositions and matrix feature representations to visualize and analyse the massive network data. Our contribution is a group visualization and analysis technique of influential actors in the network by sampling the full network with a top-k representation of the network data stream. © Springer International Publishing 2015.
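A top-k summary of a call stream can be sketched as follows. The abstract does not say which top-k algorithm is used, so this example swaps in the Space-Saving counter scheme as an assumption; the function name and the representation of each call as a `(caller, callee)` tuple are also hypothetical. The sketch keeps at most `k` counters in memory, which is what makes it usable on an unbounded stream, at the cost of possibly overestimating the counts of items that entered by evicting another.

```python
def space_saving_top_k(stream, k):
    """Approximate the k most frequent items of a stream with k counters.

    Assumption: Space-Saving is used here for illustration only; when the
    table is full, a new item replaces the item with the smallest count
    and inherits that count plus one.
    """
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k:
            counters[item] = 1
        else:
            # Evict the current minimum and give its count to the newcomer.
            victim = min(counters, key=counters.get)
            counters[item] = counters.pop(victim) + 1
    # Most frequent first.
    return sorted(counters.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical call records: each edge is a (caller, callee) pair.
calls = [("a", "b")] * 5 + [("c", "d")] * 3 + [("e", "f")]
top_edges = space_saving_top_k(calls, k=3)
```

The retained edges (and the nodes they touch) form the sampled subnetwork that would then be passed to a visualization layer.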

2017

WCDS: A Two-Phase Weightless Neural System for Data Stream Clustering

Authors
Cardoso, DO; Franca, FMG; Gama, J;

Publication
NEW GENERATION COMPUTING

Abstract
Clustering is a powerful and versatile tool for knowledge discovery, able to provide valuable information for data analysis in various domains. Performing this task on streaming data is quite challenging: outdated knowledge needs to be disposed of while current knowledge is obtained from fresh data; since data are continuously flowing, strict efficiency constraints have to be met. This paper presents WCDS, an approach to this problem based on the WiSARD artificial neural network model. This model already had useful characteristics such as inherent incremental learning capability and patent functioning speed. These were combined with novel features such as an adaptive countermeasure to cluster imbalance, a mechanism to discard expired data, and offline clustering based on a pairwise similarity measure for WiSARD discriminators. In an insightful experimental evaluation, the proposed system had an excellent performance according to multiple quality standards. This supports its applicability for the analysis of data streams.

2017

Weightless neural networks for open set recognition

Authors
Cardoso, DO; Gama, J; Franca, FMG;

Publication
MACHINE LEARNING

Abstract
Open set recognition is a classification-like task. It is accomplished not only by the identification of observations which belong to targeted classes (i.e., the classes among those represented in the training sample which should be later recognized) but also by the rejection of inputs from other classes in the problem domain. The need for proper handling of elements of classes beyond those of interest is frequently ignored, even in works found in the literature. This leads to the improper development of learning systems, which may obtain misleading results when evaluated in their test beds, consequently failing to keep the performance level while facing some real challenge. The adaptation of a classifier for open set recognition is not always possible: the probabilistic premises most of them are built upon are not valid in an open-set setting. Still, this paper details how this was realized for WiSARD, a weightless artificial neural network model. Such achievement was based on an elaborate distance-like computation this model provides and the definition of rejection thresholds during training. The proposed methodology was tested through a collection of experiments, with distinct backgrounds and goals. The results obtained confirm the usefulness of this tool for open set recognition.
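The combination of a WiSARD-style classifier with a rejection threshold can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not describe the distance-like computation or how thresholds are derived during training, so this version assumes the plain WiSARD response (the fraction of RAM nodes that recognise the input) and a single fixed `threshold` supplied by the user. The class and parameter names are hypothetical.

```python
import random


class Discriminator:
    """One WiSARD discriminator: a list of RAM nodes storing seen addresses."""

    def __init__(self, n_rams):
        self.rams = [set() for _ in range(n_rams)]

    def train(self, addresses):
        for ram, addr in zip(self.rams, addresses):
            ram.add(addr)

    def score(self, addresses):
        # Fraction of RAM nodes that have seen their address before.
        hits = sum(addr in ram for ram, addr in zip(self.rams, addresses))
        return hits / len(self.rams)


class OpenSetWisard:
    """WiSARD with a fixed rejection threshold (illustrative assumption)."""

    def __init__(self, input_bits, tuple_size, threshold, seed=0):
        assert input_bits % tuple_size == 0
        order = list(range(input_bits))
        random.Random(seed).shuffle(order)  # pseudo-random input mapping
        self.tuples = [order[i:i + tuple_size]
                       for i in range(0, input_bits, tuple_size)]
        self.threshold = threshold
        self.discriminators = {}

    def _addresses(self, bits):
        return [tuple(bits[i] for i in idx) for idx in self.tuples]

    def fit(self, samples, labels):
        for bits, label in zip(samples, labels):
            if label not in self.discriminators:
                self.discriminators[label] = Discriminator(len(self.tuples))
            self.discriminators[label].train(self._addresses(bits))

    def predict(self, bits):
        """Return the best-matching label, or None to reject the input."""
        addresses = self._addresses(bits)
        scores = {label: d.score(addresses)
                  for label, d in self.discriminators.items()}
        best = max(scores, key=scores.get)
        return best if scores[best] >= self.threshold else None


model = OpenSetWisard(input_bits=8, tuple_size=2, threshold=0.75)
model.fit([[0] * 8], ["low"])
```

Here `model.predict([0] * 8)` returns `"low"`, while `model.predict([1] * 8)` returns `None` because no RAM node recognises the input, which is the open-set rejection case.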
