Publications

Publications by Bruno Miguel Veloso

2022

The MetroPT dataset for predictive maintenance

Authors
Veloso, B; Gama, J; Ribeiro, RP; Pereira, PM;

Publication
SCIENTIFIC DATA

Abstract
The paper describes the MetroPT data set, an outcome of a Predictive Maintenance project with an urban metro public transportation service in Porto, Portugal. The data was collected in 2022 to develop machine learning methods for online anomaly detection and failure prediction. Several analog sensor signals (pressure, temperature, current consumption), digital signals (control signals, discrete signals), and GPS information (latitude, longitude, and speed) provide a framework that can be easily used and help the development of new machine learning methods. This dataset contains some interesting characteristics and can be a good benchmark for predictive maintenance models.


2023

Predictive Maintenance, Adversarial Autoencoders and Explainability

Authors
Silva, MEP; Veloso, B; Gama, J;

Publication
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES: APPLIED DATA SCIENCE AND DEMO TRACK, ECML PKDD 2023, PT VII

Abstract
The transition to Industry 4.0 provoked a transformation of industrial manufacturing with a significant leap in automation and intelligent systems. This paradigm shift has brought about a mindset that emphasizes predictive maintenance: detecting future failures when current behaviour of industrial processes and machines is thought to be normal. The constant monitoring of industrial equipment produces massive quantities of data that enable the application of machine learning approaches to this task. This study uses deep learning-based models to build a data-driven predictive maintenance framework for the air production unit (APU), a crucial system for the proper functioning of a Metro do Porto train. This public transport system moves thousands of people every day and train failures lead to delays and loss of trust by clients. Therefore, it is essential not only to detect APU failures before they occur to minimize negative impacts, but also to provide explanations for the failure warnings that can aid in decision-making processes. We propose an autoencoder architecture trained with an adversarial loss, known as the Wasserstein Autoencoder with Generative Adversarial Network (WAE-GAN), designed to detect sensor failures in systems connected to the APU. Our model can detect APU failures up to two hours before they occur, allowing timely intervention of the maintenance teams. We further augment our model with an explainability layer, by providing explanations generated by a rule-based model that focuses on rare events. Results show that our model is able to detect APU failures without any false alarms, fulfilling the requisites of Metro do Porto for early detection of the failures.


2023

MetroPT-3 Dataset

Authors
Davari, N; Veloso, B; Ribeiro, RP; Gama, J;

Publication

Abstract

2024

SWINN: Efficient nearest neighbor search in sliding windows using graphs

Authors
Mastelini, SM; Veloso, B; Halford, M; de Carvalho, ACPDF; Gama, J;

Publication
INFORMATION FUSION

Abstract
Nearest neighbor search (NNS) is one of the main concerns in data stream applications since similarity queries can be used in multiple scenarios. Online NNS is usually performed on a sliding window by lazily scanning every element currently stored in the window. This paper proposes Sliding Window-based Incremental Nearest Neighbors (SWINN), a graph-based online search index algorithm for speeding up NNS in potentially never-ending and dynamic data stream tasks. Our proposal broadens the application of online NNS-based solutions, as even moderately large data buffers become impractical to handle when a naive NNS strategy is selected. SWINN enables efficient handling of large data buffers by using an incremental strategy to build and update a search graph supporting any distance metric. Vertices can be added and removed from the search graph. To keep the graph reliable for search queries, lightweight graph maintenance routines are run. According to experimental results, SWINN is significantly faster than performing a naive complete scan of the data buffer while keeping competitive search recall values. We also apply SWINN to online classification and regression tasks and show that our proposal is effective against popular online machine learning algorithms.
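For orientation, the naive baseline the abstract contrasts SWINN against (scanning every element in the sliding window for each query) can be sketched as follows. This is only an illustrative sketch of that baseline, not SWINN itself and not code from the paper; the class name `SlidingWindowNN`, its methods, and the default Euclidean distance are assumptions made for the example.

```python
from collections import deque
import math


class SlidingWindowNN:
    """Naive baseline (hypothetical helper): full scan of a fixed-size window."""

    def __init__(self, window_size, dist=None):
        # deque(maxlen=...) drops the oldest point automatically when full.
        self.window = deque(maxlen=window_size)
        # Any distance function can be plugged in; Euclidean is an assumed default.
        self.dist = dist or math.dist

    def add(self, point):
        """Append a new stream element, evicting the oldest if the window is full."""
        self.window.append(tuple(point))

    def query(self, point, k=1):
        """Return the k stored points closest to `point` by scanning the whole window."""
        ranked = sorted(self.window, key=lambda p: self.dist(p, point))
        return ranked[:k]


index = SlidingWindowNN(window_size=3)
for p in [(0, 0), (1, 1), (5, 5), (6, 6)]:
    index.add(p)  # (0, 0) is evicted once the fourth point arrives
nearest = index.query((0, 0), k=1)
```

Each query here costs time linear in the window size, which is the cost the abstract says becomes impractical for even moderately large buffers.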


2023

Why Industry 5.0 Needs XAI 2.0?

Authors
Bobek, S; Nowaczyk, S; Gama, J; Pashami, S; Ribeiro, RP; Taghiyarrenani, Z; Veloso, B; Rajaoarisoa, LH; Szelazek, M; Nalepa, GJ;

Publication
Joint Proceedings of the xAI-2023 Late-breaking Work, Demos and Doctoral Consortium co-located with the 1st World Conference on eXplainable Artificial Intelligence (xAI-2023), Lisbon, Portugal, July 26-28, 2023.

Abstract
Advances in artificial intelligence trigger transformations that make more and more companies enter Industry 4.0 and 5.0 eras. In many cases, these transformations are gradual and performed in a bottom-up manner. This means that in the first step, the industrial hardware is upgraded to collect as much data as possible without actual planning of the utilization of the information. Furthermore, the data storage and processing infrastructure is prepared to keep large volumes of historical data accessible for further analysis. Only in the last step are methods for processing the data developed to improve or gain more insight into the industrial and business processes. Such a pipeline makes many companies face a problem with huge amounts of data, an incomplete understanding of how the existing knowledge is represented in the data, under which conditions the knowledge no longer holds, or what new phenomena are hidden inside the data. We argue that this gap needs to be addressed by the next generation of XAI methods which should be expert-oriented and focused on knowledge generation tasks rather than model debugging. The paper is based on the findings of the EU CHIST-ERA project on Explainable Predictive Maintenance (XPM).


2024

Improving hyper-parameter self-tuning for data streams by adapting an evolutionary approach

Authors
Moya, AR; Veloso, B; Gama, J; Ventura, S;

Publication
DATA MINING AND KNOWLEDGE DISCOVERY

Abstract
Hyper-parameter tuning of machine learning models has become a crucial task in achieving optimal results in terms of performance. Several researchers have explored the optimisation task during the last decades to reach a state-of-the-art method. However, most of them focus on batch or offline learning, where data distributions do not change arbitrarily over time. On the other hand, dealing with data streams and online learning is a challenging problem. In fact, the higher the technology goes, the greater the importance of sophisticated techniques to process these data streams. Thus, improving hyper-parameter self-tuning during online learning of these machine learning models is crucial. To this end, in this paper, we present MESSPT, an evolutionary algorithm for self-hyper-parameter tuning for data streams. We apply Differential Evolution to dynamically-sized samples, requiring a single pass-over of data to train and evaluate models and choose the best configurations. We take care of the number of configurations to be evaluated, which necessarily has to be reduced, thus making this evolutionary approach a micro-evolutionary one. Furthermore, we control how our evolutionary algorithm deals with concept drift. Experiments on different learning tasks and over well-known datasets show that our proposed MESSPT outperforms the state-of-the-art on hyper-parameter tuning for data streams.
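As background for the Differential Evolution step mentioned in the abstract, one generation of the textbook DE/rand/1/bin scheme over a small population of hyper-parameter vectors might look like the sketch below. This is not MESSPT: the function name `de_step`, the parameters `f` and `cr`, the bounds, and the toy fitness function are all assumptions for illustration, and the paper's dynamically-sized samples and concept-drift handling are not modelled.

```python
import random


def de_step(population, fitness, bounds, f=0.5, cr=0.9, rng=random):
    """One generation of standard DE/rand/1/bin (lower fitness is better).

    population: list of candidate vectors (needs at least 4 members).
    bounds: list of (low, high) pairs, one per dimension.
    """
    dim = len(bounds)
    next_population = []
    for i, target in enumerate(population):
        # Pick three distinct candidates other than the target.
        others = [p for j, p in enumerate(population) if j != i]
        a, b, c = rng.sample(others, 3)
        j_rand = rng.randrange(dim)  # guarantees at least one mutated dimension
        trial = []
        for d in range(dim):
            if rng.random() < cr or d == j_rand:
                low, high = bounds[d]
                value = a[d] + f * (b[d] - c[d])         # differential mutation
                trial.append(min(max(value, low), high))  # clip to bounds
            else:
                trial.append(target[d])                   # inherit from target
        # Greedy selection: keep the trial only if it is not worse.
        next_population.append(trial if fitness(trial) <= fitness(target) else target)
    return next_population


# Toy usage: the fitness below is a stand-in for a model's validation error.
rng = random.Random(0)
bounds = [(0.0, 1.0), (1.0, 10.0)]
toy_fitness = lambda x: (x[0] - 0.3) ** 2 + (x[1] - 4.0) ** 2
population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(5)]
initial_best = min(map(toy_fitness, population))
for _ in range(20):
    population = de_step(population, toy_fitness, bounds, rng=rng)
final_best = min(map(toy_fitness, population))
```

The small population of five candidates loosely mirrors the abstract's point that the number of configurations evaluated has to stay small in a streaming setting.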
