Publications

Publications by Rita Paula Ribeiro

2023

Fault Detection in Wastewater Treatment Plants: Application of Autoencoders Models with Streaming Data

Authors
Salles, R; Mendes, J; Ribeiro, RP; Gama, J;

Publication
MACHINE LEARNING AND PRINCIPLES AND PRACTICE OF KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2022, PT I

Abstract
Water is a fundamental human resource and its scarcity is reflected in social, economic and environmental problems. Water used in human activities must be treated before reusing or returning to nature. This treatment takes place in wastewater treatment plants (WWTPs), which need to perform their functions with high quality, low cost, and reduced environmental impact. This paper aims to identify failures in real time, using streaming data to provide the necessary preventive actions to minimize damage to WWTPs, heavy fines and, ultimately, environmental hazards. Convolutional and long short-term memory (LSTM) autoencoders (AEs) were used to identify failures in the functioning of the dissolved oxygen sensor used in WWTPs. Five faults were considered (drift, bias, precision degradation, spike and stuck) in three different scenarios with variations in the appearance order, intensity and duration of the faults. The best performance, considering different model configurations, was achieved by the Convolutional-AE.
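The five fault types named in the abstract can be illustrated with a minimal sketch that injects them into a clean sensor series. This is not the authors' procedure: the function name `inject_fault`, its parameters, and the exact fault shapes (constant offset for bias, linear ramp for drift, added Gaussian noise for precision degradation, a large jump every fifth sample for spike, a frozen reading for stuck) are assumptions chosen for illustration.

```python
import random

FAULT_KINDS = ("bias", "drift", "precision", "spike", "stuck")

def inject_fault(signal, kind, start, end, magnitude=1.0, seed=0):
    """Return a copy of `signal` with one synthetic fault on indices [start, end).

    Hypothetical helper: the fault shapes below are illustrative assumptions,
    not the definitions used in the paper.
    """
    if kind not in FAULT_KINDS:
        raise ValueError(f"unknown fault kind: {kind}")
    rng = random.Random(seed)          # seeded so runs are reproducible
    out = list(signal)                 # never mutate the caller's data
    # Value the sensor freezes at for a "stuck" fault: last healthy reading.
    frozen = signal[start - 1] if start > 0 else signal[0]
    for i in range(start, end):
        if kind == "bias":             # constant offset
            out[i] += magnitude
        elif kind == "drift":          # offset growing linearly up to `magnitude`
            out[i] += magnitude * (i - start + 1) / (end - start)
        elif kind == "precision":      # extra zero-mean noise
            out[i] += rng.gauss(0.0, magnitude)
        elif kind == "spike":          # isolated jump on every 5th faulty sample
            if (i - start) % 5 == 0:
                out[i] += 5 * magnitude
        else:                          # "stuck": reading stops changing
            out[i] = frozen
    return out

# Usage: a flat, made-up dissolved-oxygen series and a rising ramp.
flat = [2.0] * 20
ramp = [float(i) for i in range(20)]
biased = inject_fault(flat, "bias", 5, 10, magnitude=0.5)
drifted = inject_fault(flat, "drift", 5, 10, magnitude=0.5)
spiked = inject_fault(flat, "spike", 5, 10, magnitude=0.5)
stuck = inject_fault(ramp, "stuck", 5, 10)
```

Series produced this way could serve as labelled test input for a reconstruction-error detector such as an autoencoder, where samples inside the faulty window are expected to reconstruct poorly.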


2022

MetroPT2: A Benchmark dataset for predictive maintenance

Authors
Veloso, B; Gama, J; Ribeiro, RP; Pereira, P;

Publication

Abstract

2023

Multimodal Classification of Anxiety Based on Physiological Signals

Authors
Vaz, M; Summavielle, T; Sebastiao, R; Ribeiro, RP;

Publication
APPLIED SCIENCES-BASEL

Abstract
Multiple studies show an association between anxiety disorders and dysregulation in the Autonomic Nervous System (ANS). Thus, understanding how informative the physiological signals are would contribute to effectively detecting anxiety. This study targets the classification of anxiety as an imbalanced binary classification problem using physiological signals collected from a sample of healthy subjects under a neutral condition. For this purpose, the Electrocardiogram (ECG), Electrodermal Activity (EDA), and Electromyogram (EMG) signals from the WESAD publicly available dataset were used. The neutral condition was collected for around 20 min on 15 participants, and anxiety scores were assessed through the shortened 6-item STAI. To achieve the described goal, the subsequent steps were followed: signal pre-processing; feature extraction, analysis, and selection; and classification of anxiety. The findings of this study allowed us to classify anxiety with discriminatory class features based on physiological signals. Moreover, feature selection revealed that ECG features play a relevant role in anxiety classification. Supervised feature selection and data balancing techniques, especially Borderline SMOTE 2, increased the performance of most classifiers. In particular, the combination of feature selection and Borderline SMOTE 2 achieved the best ROC-AUC with the Random Forest classifier.


2023

XAI for Predictive Maintenance

Authors
Gama, J; Nowaczyk, S; Pashami, S; Ribeiro, RP; Nalepa, GJ; Veloso, B;

Publication
PROCEEDINGS OF THE 29TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2023

Abstract
The field of Explainable Predictive Maintenance (PM) is concerned with developing methods that can clarify how AI systems operate in the PM domain. One of the challenges of creating maintenance plans is integrating AI output with human decision-making processes and expertise. For AI to be helpful and trustworthy, fault predictions must be contextualized and easily comprehensible to humans. This involves providing tailored explanations to different actors depending on their roles and needs. For example, engineers can be connected to technical installation blueprints, while managers can evaluate system downtime costs, and lawyers can assess safety-threatening failures' potential liability. In many industries, black-box AI systems analyze sensor data to predict failures by detecting anomalies and deviations from typical behavior with impressive accuracy. However, PM is just one part of a broader context that aims to identify the most probable causes, develop a recovery plan, and estimate remaining useful life while providing alternative solutions. Achieving this requires complex interactions among various actors in industrial and decision-making processes. Our tutorial explores current trends, promising research directions in Explainable AI (XAI) relevant to Explainable Predictive Maintenance (XPM), and future challenges and open issues on this topic. We will also present three case studies that highlight XPM's challenges in bus and train operations and steel factories.


2008

A comparative study on predicting algae blooms in Douro River, Portugal

Authors
Ribeiro, R; Torgo, L;

Publication
ECOLOGICAL MODELLING

Abstract
Algae blooms are ecological events associated with extremely high abundance values of certain algae. These rare events have a strong impact on the river's ecosystem. In this context, the prediction of such events is of special importance. This paper addresses the problems that result from evaluating and comparing models for the prediction of rare extreme values using standard evaluation statistics. In this context, we describe a new evaluation statistic that we have proposed in Torgo and Ribeiro [Torgo, L., Ribeiro, R., 2006. Predicting rare extreme values. In: Ng, W., Kitsuregawa, M., Li, J., Chang, K. (Eds.), Proceedings of the 10th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'2006). Springer, pp. 816-820 (number 3918 in LNAI)], which can be used to identify the best models for predicting algae blooms. We apply this new statistic in a comparative study involving several models for predicting the abundance of different groups of phytoplankton in water samples collected in Douro River, Porto, Portugal. Results show that the proposed statistic identifies a variant of a Support Vector Machine as outperforming the other models that were tried in the prediction of algae blooms.


2012

Towards Utility Maximization in Regression

Authors
Ribeiro, RP;

Publication
12TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW 2012)

Abstract
Utility-based learning is a key technique for addressing many real-world data mining applications, where the costs/benefits are not uniform across the domain of the target variable. Still, most of the existing research has been focused on classification problems. In this paper we address a related problem. There are many relevant domains (e.g. ecological, meteorological, finance) where decisions are based on the forecast of a numeric quantity (i.e. the result of a regression model). The goal of the work in this paper is to present an evaluation framework for applications where the numeric outcome of a regression model may lead to different costs/benefits as a consequence of the actions it entails. The new metric provides a more informed estimate of the utility of any regression model, given the application-specific preference biases, and hence makes the comparison and selection between alternative regression models more reliable. We illustrate the objective of our evaluation methodology on a real-life application and also carry out a set of experiments over a subset of our target regression tasks: the prediction of rare and extreme values. Results show the effectiveness of our proposed utility metric for identifying the models that perform better on this type of application.
