2020
Authors
Veloso, B; Gama, J; Martins, C; Espanha, R; Azevedo, R;
Publication
ACM SIGAPP Applied Computing Review
Abstract
2020
Authors
Veloso, B; Gama, J;
Publication
IoT Streams for Data-Driven Predictive Maintenance and IoT, Edge, and Mobile for Embedded Machine Learning - Second International Workshop, IoT Streams 2020, and First International Workshop, ITEM 2020, Co-located with ECML/PKDD 2020, Ghent, Belgium, September 14-18, 2020, Revised Selected Papers
Abstract
The new 5G mobile communication system era brings a new set of communication devices that will appear on the market. These devices will generate data streams that require proper handling by machine algorithms. The processing of these data streams requires the design, development, and adaptation of appropriate machine learning algorithms. While stream processing algorithms include hyper-parameters for performance refinement, their tuning process is time-consuming and typically requires an expert to do the task. In this paper, we present an extension of the Self Parameter Tuning (SPT) optimization algorithm for data streams. We apply the Nelder-Mead algorithm to dynamically sized samples that converge to optimal settings in a double pass over data (during the exploration phase), using a relatively small number of data points. Additionally, the SPT automatically readjusts hyper-parameters when concept drift occurs. We did a set of experiments with well-known classification data sets and the results show that the proposed algorithm can outperform the results of previous hyper-parameter tuning efforts by human experts. The statistical results show that this extension is faster in terms of convergence and presents at least similar accuracy results when compared with the standard optimization techniques. © 2020, Springer Nature Switzerland AG.
2020
Authors
Barros, M; Veloso, B; Pereira, PM; Ribeiro, RP; Gama, J;
Publication
IoT Streams for Data-Driven Predictive Maintenance and IoT, Edge, and Mobile for Embedded Machine Learning - Second International Workshop, IoT Streams 2020, and First International Workshop, ITEM 2020, Co-located with ECML/PKDD 2020, Ghent, Belgium, September 14-18, 2020, Revised Selected Papers
Abstract
The transformation of industrial manufacturing with computers and automation with smart systems leads us to monitor and log of industrial equipment events. It is possible to apply analytic approaches, and to find interpretive results for strategic decision making, providing advantages such as failure detection and predictive maintenance. Over the last years, many researchers have been studying the application of machine learning techniques to improve such tasks. In this context, we develop a system capable of detect anomalies on an Air Production Unit (APU), taking into consideration the peak frequency of each sensor. The study started with the analysis of the sensors installed on the APU, defining its normal behavior and its failure mode. Using that information, we define rules, to monitor the APU, to detect anomalies on its components, and to predict possible failures. The definition of rules was based on the peak frequency analysis, which allowed the setting of boundaries of normality for the APU working modes and, thus, the identification of anomalies. © 2020, Springer Nature Switzerland AG.
2021
Authors
Veloso, B; Gama, J; Malheiro, B;
Publication
Encyclopedia of Information Science and Technology, Fifth Edition - Advances in Information Quality and Management
Abstract
2020
Authors
Bahri, M; Veloso, B; Bifet, A; Gama, J;
Publication
2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA)
Abstract
The last few decades have witnessed a significant evolution of technology in different domains, changing the way the world operates, which leads to an overwhelming amount of data generated in an open-ended way as streams. Over the past years, we observed the development of several machine learning algorithms to process big data streams. However, the accuracy of these algorithms is very sensitive to their hyper-parameters, which requires expertise and extensive trials to tune. Another relevant aspect is the high-dimensionality of data, which can causes degradation to computational performance. To cope with these issues, this paper proposes a stream k-nearest neighbors (kNN) algorithm that applies an internal dimension reduction to the stream in order to reduce the resource usage and uses an automatic monitoring system that tunes dynamically the configuration of the kNN algorithm and the output dimension size with big data streams. Experiments over a wide range of datasets show that the predictive and computational performances of the kNN algorithm are improved.
2021
Authors
Leal, F; Veloso, B; Malheiro, B; Burguillo, JC;
Publication
Trends and Applications in Information Systems and Technologies - Volume 1, WorldCIST 2021, Terceira Island, Azores, Portugal, 30 March - 2 April, 2021.
Abstract
Crowdsourced data streams are continuous flows of data generated at high rate by users, also known as the crowd. These data streams are popular and extremely valuable in several domains. This is the case of tourism, where crowdsourcing platforms rely on tourist and business inputs to provide tailored recommendations to future tourists in real time. The continuous, open and non-curated nature of the crowd-originated data requires robust data stream mining techniques for on-line profiling, recommendation and evaluation. The sought techniques need, not only, to continuously improve profiles and learn models, but also be transparent, overcome biases, prioritise preferences, and master huge data volumes; all in real time. This article surveys the state-of-art in this field, and identifies future research opportunities. © 2021, The Author(s), under exclusive license to Springer Nature Switzerland AG.
The access to the final selection minute is only available to applicants.
Please check the confirmation e-mail of your application to obtain the access code.