Publications

Publications by Bruno Miguel Veloso

2021

Hyperparameter self-tuning for data streams

Authors
Veloso, B; Gama, J; Malheiro, B; Vinagre, J;

Publication
INFORMATION FUSION

Abstract
The number of Internet of Things devices generating data streams is expected to grow exponentially with the support of emergent technologies such as 5G networks. Therefore, the online processing of these data streams requires the design and development of suitable machine learning algorithms, able to learn online, as data is generated. Like their batch-learning counterparts, stream-based learning algorithms require careful hyperparameter settings. However, this problem is exacerbated in online learning settings, especially with the occurrence of concept drifts, which frequently require the reconfiguration of hyperparameters. In this article, we present SSPT, an extension of the Self Parameter Tuning (SPT) optimisation algorithm for data streams. We apply the Nelder-Mead algorithm to dynamically-sized samples, converging to optimal settings in a single pass over data while using a relatively small number of hyperparameter configurations. In addition, our proposal automatically readjusts hyperparameters when concept drift occurs. To assess the effectiveness of SSPT, the algorithm is evaluated with three different machine learning problems: recommendation, regression, and classification. Experiments with well-known data sets show that the proposed algorithm can outperform previous hyperparameter tuning efforts by human experts. Results also show that SSPT converges significantly faster and presents at least similar accuracy when compared with the previous double-pass version of the SPT algorithm.
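As a rough illustration of the Nelder-Mead search the abstract mentions, the sketch below minimises a loss over two hyperparameters using the classic reflection, expansion, contraction, and shrink steps. It is not the SSPT algorithm: it omits the dynamically sized samples and the concept drift handling, and the names `nelder_mead` and `toy_loss`, the coefficients, and the target values are assumptions made for this example.

```python
from typing import Callable, List

Point = List[float]


def nelder_mead(loss: Callable[[Point], float], start: List[Point],
                iterations: int = 200) -> Point:
    """Minimise `loss` with the textbook Nelder-Mead simplex operations."""
    simplex = [list(p) for p in start]  # copy so the caller's list is untouched
    n = len(simplex[0])                 # dimensions; the simplex holds n + 1 points
    for _ in range(iterations):
        simplex.sort(key=loss)          # best configuration first, worst last
        best, worst = simplex[0], simplex[-1]
        # Centroid of every point except the worst one.
        centroid = [sum(p[i] for p in simplex[:-1]) / n for i in range(n)]

        def move(t: float) -> Point:
            # Point on the line through the centroid, away from the worst point.
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = move(1.0)
        if loss(reflected) < loss(best):
            expanded = move(2.0)
            simplex[-1] = expanded if loss(expanded) < loss(reflected) else reflected
        elif loss(reflected) < loss(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = move(-0.5)     # halfway between centroid and worst
            if loss(contracted) < loss(worst):
                simplex[-1] = contracted
            else:
                # Shrink every point halfway towards the best one.
                simplex = [[b + 0.5 * (x - b) for b, x in zip(best, p)]
                           for p in simplex]
    return min(simplex, key=loss)


def toy_loss(config: Point) -> float:
    # Hypothetical stand-in for a validation error; minimum at (0.1, 0.01).
    learning_rate, regularisation = config
    return (learning_rate - 0.1) ** 2 + (regularisation - 0.01) ** 2


best_config = nelder_mead(toy_loss, [[0.5, 0.5], [0.6, 0.5], [0.5, 0.6]])
```

In a streaming setting the loss would presumably be estimated on recent samples rather than by a fixed function, which is the part this sketch leaves out.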


2021

Improving Student Engagement With Project-Based Learning: A Case Study in Software Engineering

Authors
Morais, P; Ferreira, MJ; Veloso, B;

Publication
IEEE REVISTA IBEROAMERICANA DE TECNOLOGIAS DEL APRENDIZAJE-IEEE RITA

Abstract
In the area of Information and Communication Technologies, in addition to the problem of engagement, students often have difficulties in learning subjects related to modeling and programming. The reasons for these difficulties are well known and described in the literature, pointing to difficulties in abstraction and logical thinking. Knowing the value of flexible and personalized learning, teachers are changing the way they teach, using different active learning methodologies, such as flipped classroom, project-based learning, and peer instruction. This paper describes an experiment conducted to improve the learning experiences of the students enrolled in the Computer Science bachelor's degree course, attending three curricular units: Information Systems Development, Data Structures, and Web Languages and Technologies. The approach followed by the teachers used project-based learning as an active learning methodology. This methodology allows us to achieve four main objectives: (i) improve student engagement; (ii) improve the achievement of learning outcomes; (iii) increase the course success rate; and (iv) allow students to experience the need for the software development lifecycle, realizing that software engineering is not a block-based process but one in which, depending on previous activity, it is often necessary to go back in the process. The active methodology was well accepted by the students and allowed both teachers and students to reach the objectives set.


2021

A Survey on Data-Driven Predictive Maintenance for the Railway Industry

Authors
Davari, N; Veloso, B; Costa, GD; Pereira, PM; Ribeiro, RP; Gama, J;

Publication
SENSORS

Abstract
In the last few years, many works have addressed Predictive Maintenance (PdM) by the use of Machine Learning (ML) and Deep Learning (DL) solutions, especially the latter. The monitoring and logging of industrial equipment events, like temporal behavior and fault events (anomaly detection in time series), can be obtained from records generated by sensors installed in different parts of an industrial plant. However, such progress is incipient because we still have many challenges, and the performance of applications depends on the appropriate choice of the method. This article presents a survey of existing ML and DL techniques for handling PdM in the railway industry. This survey discusses the main approaches for this specific application within a taxonomy defined by the type of task, employed methods, metrics of evaluation, the specific equipment or process, and datasets. Lastly, we conclude and outline some suggestions for future research.


2021

Predictive maintenance based on anomaly detection using deep learning for air production unit in the railway industry

Authors
Davari, N; Veloso, B; Ribeiro, RP; Pereira, PM; Gama, J;

Publication
2021 IEEE 8TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA)

Abstract
Predictive maintenance methods assist in the early detection of failures and errors in machinery before they reach critical stages. This study proposes a data-driven predictive maintenance framework for the air production unit (APU) system of a train of Metro do Porto by deep learning based on a sparse autoencoder (SAE) network that efficiently detects abnormal data and considerably reduces the false alarm rate. Several analog and digital sensors installed on the APU system allow the detection of behavioral changes and deviations from the normal pattern by analyzing the collected data. We implemented two versions of the SAE network, one fed with analog sensor data and one with digital sensor data, and the experimental results show that the failures due to air leakage problems are predicted by analog sensor data while other types of failures are identified by digital sensor data. A low pass filter is applied to the output of the SAE network, and a sequence of abnormal data is used as an alarm for the APU system failure. Performance indicators of the SAE network with digital sensor data, in terms of F1 Score, Recall, and Precision, are, respectively, about 33.6%, 42%, and 28% better than those of the SAE network with analog sensor data. For comparison purposes, we also implemented a variational autoencoder (VAE). The results show that SAE performance is better than that of VAE by 14%, 77%, and 37%, respectively, for Recall, Precision, and F1 Score.
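The post-processing step described above (a low pass filter on the network output, then an alarm on a sequence of abnormal data) could look roughly like the sketch below. The abstract does not say which filter or thresholds were used, so the exponential moving average, the function names `low_pass` and `first_alarm`, and the values of `alpha`, `threshold`, and `min_run` are all assumptions chosen for illustration.

```python
from typing import List


def low_pass(values: List[float], alpha: float = 0.3) -> List[float]:
    """First-order low-pass filter (exponential moving average)."""
    if not values:
        return []
    smoothed = []
    state = values[0]  # start the filter at the first observation
    for v in values:
        state = alpha * v + (1 - alpha) * state
        smoothed.append(state)
    return smoothed


def first_alarm(scores: List[float], threshold: float, min_run: int) -> int:
    """Index at which `min_run` consecutive filtered scores exceed `threshold`.

    Returns -1 when no such run exists. `scores` stands in for per-sample
    anomaly scores (for example, autoencoder reconstruction errors).
    """
    run = 0
    for i, s in enumerate(low_pass(scores)):
        run = run + 1 if s > threshold else 0
        if run >= min_run:
            return i
    return -1


# Hypothetical score sequences: one isolated spike, one sustained anomaly.
spike = [0.1, 0.1, 0.9, 0.1, 0.1, 0.1]
sustained = [0.1, 0.1, 0.9, 0.9, 0.9, 0.9, 0.9]

spike_alarm = first_alarm(spike, threshold=0.5, min_run=3)
sustained_alarm = first_alarm(sustained, threshold=0.5, min_run=3)
```

With these made-up values the isolated spike is smoothed away and raises no alarm, while the sustained run does, which is one way filtering can lower the false alarm rate.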


2021

Current Trends in Learning from Data Streams

Authors
Gama, J; Veloso, B; Aminian, E; Ribeiro, RP;

Publication
9TH INTERNATIONAL CONFERENCE ON BIG DATA ANALYTICS, BDA 2021

Abstract
This article presents our recent work on the topic of learning from data streams. We focus on emerging topics, including fraud detection, learning from rare cases, and hyper-parameter tuning for streaming data. © 2021, Springer Nature Switzerland AG.


2021

Hyper-parameter Optimization for Latent Spaces

Authors
Veloso, B; Caroprese, L; Konig, M; Teixeira, S; Manco, G; Hoos, HH; Gama, J;

Publication
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2021: RESEARCH TRACK, PT III

Abstract
We present an online optimization method for time-evolving data streams that can automatically adapt the hyper-parameters of an embedding model. More specifically, we employ the Nelder-Mead algorithm, which uses a set of heuristics to produce and exploit several potentially good configurations, from which the best one is selected and deployed. This step is repeated whenever the distribution of the data changes. We evaluate our approach on streams of real-world as well as synthetic data, where the latter is generated in such a way that its characteristics change over time (concept drift). Overall, we achieve good performance in terms of accuracy compared to state-of-the-art AutoML techniques.
