2010
Authors
Ohashi, O; Torgo, L; Ribeiro, RP;
Publication
ECAI 2010 - 19TH EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE
Abstract
The current quality control methodology adopted by the water distribution service provider in the metropolitan region of Porto, Portugal, is based on simple heuristics and empirical knowledge. Given the domain complexity and data volume, this application is a strong candidate for a data mining process. In this paper, we propose a new methodology to predict the range of normality for the values of different water quality parameters. These intervals of normality are of key importance when deciding on costly inspection activities. Our experimental evaluation confirms that our proposal achieves good results on the task of forecasting the normal distribution of values for the following 30 days. The proposed method can be applied to other domains with similar network monitoring objectives.
2023
Authors
Pashami, S; Nowaczyk, S; Fan, Y; Jakubowski, J; Paiva, N; Davari, N; Bobek, S; Jamshidi, S; Sarmadi, H; Alabdallah, A; Ribeiro, RP; Veloso, B; Mouchaweh, MS; Rajaoarisoa, LH; Nalepa, GJ; Gama, J;
Publication
CoRR
Abstract
2023
Authors
Tome, ES; Ribeiro, RP; Dutra, I; Rodrigues, A;
Publication
SENSORS
Abstract
The early detection of fire is of utmost importance since fire poses devastating threats to human lives and causes economic losses. Unfortunately, fire alarm sensory systems are known to be prone to failures and frequent false alarms, putting people and buildings at risk. It is therefore essential to guarantee that smoke detectors function correctly. Traditionally, these systems have been subject to periodic maintenance plans, which do not consider the state of the fire alarm sensors and are, therefore, sometimes carried out not when necessary but according to a predefined conservative schedule. To contribute to the design of a predictive maintenance plan, we propose an online data-driven anomaly detection approach for smoke sensors that models the behaviour of these systems over time and detects abnormal patterns that can indicate a potential failure. Our approach was applied to data collected from independent fire alarm sensory systems installed with four customers, from which about three years of data are available. For one of the customers, the obtained results were promising, with a precision score of 1 with no false positives for 3 out of 4 possible faults. Analysis of the remaining customers' results highlighted possible reasons and potential improvements to address this problem better. These findings can provide valuable insights for future research in this area.
2022
Authors
Jesus, SM; Pombal, J; Alves, D; Cruz, AF; Saleiro, P; Ribeiro, RP; Gama, J; Bizarro, P;
Publication
NeurIPS
Abstract
2022
Authors
Silva, A; Ribeiro, RP; Moniz, N;
Publication
DISCOVERY SCIENCE (DS 2022)
Abstract
Imbalanced domain learning aims to produce accurate models for predicting instances that, though underrepresented, are of utmost importance for the domain. Research in this field has mainly focused on classification tasks. Comparatively, the number of studies carried out in the context of regression tasks is negligible. One of the main reasons for this is the lack of loss functions capable of focusing on minimizing the errors of extreme (rare) values. Recently, an evaluation metric was introduced: Squared Error Relevance Area (SERA). This metric places greater emphasis on the errors committed at extreme values while also accounting for the performance in the overall target variable domain, thus preventing severe bias. However, its effectiveness as an optimization metric is unknown. In this paper, our goal is to study the impacts of using SERA as an optimization criterion in imbalanced regression tasks. Using gradient boosting algorithms as proof of concept, we perform an experimental study with 36 data sets of different domains and sizes. Results show that models that used SERA as an objective function are practically better than the models produced by their respective standard boosting algorithms at the prediction of extreme values. This confirms that SERA can be embedded as a loss function into optimization-based learning algorithms for imbalanced regression scenarios.
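As a rough illustration of the metric named in the abstract above, the sketch below computes SERA as it is usually defined: the area, over relevance thresholds t in [0, 1], under the sum of squared errors of the cases whose relevance is at least t. This is a minimal sketch, not the authors' implementation. The function name `sera`, the `steps` parameter, and the use of trapezoidal integration are assumptions made here, and the relevance values are assumed to be supplied by the caller as numbers in [0, 1].

```python
def sera(y_true, y_pred, relevance, steps=1000):
    """Approximate SERA by trapezoidal integration (illustrative sketch).

    relevance: one value in [0, 1] per case (assumed to be given).
    steps: number of integration intervals (hypothetical parameter).
    """
    # Squared error of each prediction.
    sq_err = [(p - y) ** 2 for y, p in zip(y_true, y_pred)]

    def ser(t):
        # Sum of squared errors over cases with relevance >= t.
        return sum(e for e, r in zip(sq_err, relevance) if r >= t)

    # Evaluate SER_t on an evenly spaced grid of thresholds in [0, 1].
    values = [ser(i / steps) for i in range(steps + 1)]

    # Trapezoidal rule over the grid.
    dt = 1.0 / steps
    return sum((a + b) * dt / 2.0 for a, b in zip(values, values[1:]))
```

Under this definition, an error on a highly relevant (extreme) case is counted at almost every threshold, while the same error on a low-relevance case is counted only near t = 0, which is how the metric emphasizes extreme values without ignoring the rest of the domain.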
2022
Authors
Veloso, B; Gama, J; Ribeiro, RP; Pereira, PM;
Publication
SCIENTIFIC DATA
Abstract
The paper describes the MetroPT data set, an outcome of a Predictive Maintenance project with an urban metro public transportation service in Porto, Portugal. The data was collected in 2022 to develop machine learning methods for online anomaly detection and failure prediction. Several analog sensor signals (pressure, temperature, current consumption), digital signals (control signals, discrete signals), and GPS information (latitude, longitude, and speed) provide a framework that can be easily used and help the development of new machine learning methods. This dataset contains some interesting characteristics and can be a good benchmark for predictive maintenance models.