2024
Authors
Cerqueira, V; Moniz, N; Soares, C;
Publication
MACHINE LEARNING
Abstract
Time series forecasting is a challenging task with applications in a wide range of domains. Auto-regression is one of the most common approaches to address these problems. Accordingly, observations are modelled by multiple regression using their past lags as predictor variables. We investigate the extension of auto-regressive processes using statistics which summarise the recent past dynamics of time series. The result of our research is a novel framework called VEST, designed to automatically perform feature engineering on univariate, numeric time series. The proposed approach works in three main steps. First, recent observations are mapped onto different representations. Second, each representation is summarised by statistical functions. Finally, a filter is applied for feature selection. We discovered that combining the features generated by VEST with auto-regression significantly improves forecasting performance in a database composed of 90 time series with high sampling frequency. However, we also found that there are no improvements when the framework is applied to multi-step forecasting or to time series with a small sample size. VEST is publicly available online.
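The three steps described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the VEST implementation: the function names, the two representations (raw window and first differences), the three summary statistics, and the correlation-based filter with its `min_abs_corr` threshold are all assumptions chosen for brevity. The abstract does not say which representations, statistics, or filter VEST uses.

```python
from statistics import mean, stdev


def lag_embed(series, n_lags):
    """Build lag windows and one-step-ahead targets for auto-regression."""
    n = len(series) - n_lags
    windows = [series[i:i + n_lags] for i in range(n)]
    targets = [series[i + n_lags] for i in range(n)]
    return windows, targets


def first_difference(window):
    return [b - a for a, b in zip(window, window[1:])]


# Step 1 (assumed examples): representations of the recent observations.
REPRESENTATIONS = {"identity": list, "diff": first_difference}
# Step 2 (assumed examples): statistics that summarise each representation.
SUMMARIES = {"mean": mean, "sd": stdev, "max": max}


def summarise(window):
    """Apply every summary statistic to every representation of one window."""
    features = {}
    for rep_name, represent in REPRESENTATIONS.items():
        values = represent(window)
        for stat_name, stat in SUMMARIES.items():
            features[f"{rep_name}_{stat_name}"] = stat(values)
    return features


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = mean(x), mean(y)
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (vx * vy) ** 0.5


def build_features(series, n_lags, min_abs_corr=0.1):
    """Return lags plus the summary features that pass the filter.

    n_lags must be at least 3 so the differenced window has two values,
    which stdev requires.
    """
    windows, targets = lag_embed(series, n_lags)
    rows = [summarise(w) for w in windows]
    # Step 3 (assumed filter): keep features correlated with the target.
    kept = [
        name for name in rows[0]
        if abs(pearson([r[name] for r in rows], targets)) >= min_abs_corr
    ]
    design = [list(w) + [r[name] for name in kept] for w, r in zip(windows, rows)]
    return design, targets, kept
```

The returned `design` rows hold the raw lags followed by the surviving summary features, so they could be passed to any regression learner alongside `targets`.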
2021
Authors
Ribeiro, B; Cerqueira, V; Santos, R; Gamboa, H;
Publication
2021 INTERNATIONAL CONFERENCE ON E-HEALTH AND BIOENGINEERING (EHB 2021), 9TH EDITION
Abstract
Precise machine learning models for the early identification of anomalies based on biosignal data retrieved from bedside monitors could improve intensive care by helping clinicians make decisions in advance and respond in time. However, traditional models show limitations when dealing with the high complexity of this task. Layered Learning (LL) emerges as a solution, as it consists of the hierarchical decomposition of the problem into simpler tasks. This paper explores the untapped potential of LL in the early detection of Acute Hypotensive Episodes (AHEs). We leverage information from the MIMIC-III Database to test different subdivisions of the main task and study how to combine the outcomes from distinct layers. In addition, we test a novel approach to reduce false positives in AHE predictions.
2021
Authors
Moniz, N; Cerqueira, V;
Publication
EXPERT SYSTEMS WITH APPLICATIONS
Abstract
Imbalanced learning is one of the most relevant problems in machine learning. However, it faces two crucial challenges. First, the number of methods proposed to deal with this problem has grown immensely, making the validation of a large set of methods impractical. Second, it requires specialised knowledge, hindering its use by those without that level of expertise. In this paper, we propose the Automated Imbalanced Classification method, ATOMIC. It is the first automated machine learning approach for imbalanced classification tasks. It provides a ranking of the solutions most likely to ensure an optimal approximation to a new domain, drastically reducing the associated computational complexity and energy consumption. We carry this out by anticipating the loss of a large set of predictive solutions in new imbalanced learning tasks. We compare the predictive performance of ATOMIC against state-of-the-art methods using 101 imbalanced data sets. Results demonstrate that the proposed method provides a relevant approach to imbalanced learning while reducing the learning and testing efforts of candidate solutions by approximately 95%.
2020
Authors
Cerqueira, V; Gomes, HM; Bifet, A;
Publication
Discovery Science - 23rd International Conference, DS 2020, Thessaloniki, Greece, October 19-21, 2020, Proceedings
Abstract
Concept drift detection is a crucial task in evolving data stream environments. Most state-of-the-art approaches designed to tackle this problem monitor the loss of predictive models. Accordingly, an alarm is launched when the loss increases significantly, which triggers some adaptation mechanism (e.g. retraining the model). However, this modus operandi falls short in many real-world scenarios, where the true labels are not readily available to compute the loss. Labels often take up to several weeks to become available. In this context, there is increasing attention to approaches that perform concept drift detection in an unsupervised manner, i.e., without access to the true labels. We propose a novel approach to unsupervised concept drift detection, which is based on a student-teacher learning paradigm. Essentially, we create an auxiliary model (student) to mimic the behaviour of the main model (teacher). At run-time, our approach is to use the teacher for predicting new instances and to monitor the mimicking loss of the student for concept drift detection. In a set of controlled experiments, we discovered that the proposed approach detects concept drift effectively. Relative to the gold standard, in which the labels are immediately available after prediction, our approach is more conservative: it signals fewer false alarms, but it requires more time to detect changes. We also show the competitiveness of our approach relative to other unsupervised methods. © 2020, Springer Nature Switzerland AG.
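The student-teacher idea in this abstract can be illustrated with a small sketch. Everything concrete here is an assumption made for the example: the class name `MimicLossMonitor`, the least-squares line used as the student, and the rule of raising an alarm when the rolling mean of the mimicking loss exceeds the reference mean plus `k` standard deviations. The abstract does not state which change detector the authors apply to the mimicking loss.

```python
from collections import deque
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Least-squares line, used here as a hypothetical student model."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


class MimicLossMonitor:
    """Flags drift from the student's mimicking loss, without true labels."""

    def __init__(self, teacher, student, reference_losses, window=20, k=3.0):
        self.teacher = teacher
        self.student = student
        # Assumed alarm rule: reference mean plus k standard deviations.
        self.threshold = mean(reference_losses) + k * pstdev(reference_losses)
        self.recent = deque(maxlen=window)

    def update(self, x):
        """Return (teacher prediction, drift flag) for one unlabeled instance."""
        y_teacher = self.teacher(x)
        self.recent.append(abs(y_teacher - self.student(x)))
        window_full = len(self.recent) == self.recent.maxlen
        return y_teacher, window_full and mean(self.recent) > self.threshold


# Toy setup: the teacher is a fixed nonlinear model, and the student is a
# line fitted to the teacher's predictions (not to labels) on inputs in [0, 1].
train_x = [i / 50 for i in range(51)]
teacher = lambda x: x * x
student = fit_line(train_x, [teacher(x) for x in train_x])
reference = [abs(teacher(x) - student(x)) for x in train_x]
monitor = MimicLossMonitor(teacher, student, reference, window=10)
```

In this toy setup the student mimics the teacher reasonably well on inputs in [0, 1]. When the inputs move to a region such as [3, 4], the student's predictions diverge from the teacher's and the rolling mimicking loss crosses the threshold, even though no true label is ever observed.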
2021
Authors
Costa, P; Cerqueira, V; Vinagre, J;
Publication
CoRR
Abstract
2023
Authors
Cerqueira, V; Gomes, HM; Bifet, A; Torgo, L;
Publication
Mach. Learn.
Abstract