2021
Authors
Guimaraes, M; Carneiro, D;
Publication
PROCEEDINGS OF 2021 16TH IBERIAN CONFERENCE ON INFORMATION SYSTEMS AND TECHNOLOGIES (CISTI'2021)
Abstract
Machine Learning is one of the most popular topics today, largely because it is increasingly present in our everyday lives, even if we do not notice it. What goes even more unnoticed is that every Machine Learning model needs computational power and, of course, data. But how much data is necessary to build the best possible Machine Learning model, and how often do we need to retrain a model so that it does not become obsolete as the data change? Answering questions of this kind can reduce unnecessary costs for a company. In this paper we propose a novel approach to predict the performance of a model given some characteristics of the data, called meta-features. The goal is to train a new model only when some error metric (e.g., RMSE) is expected to decrease substantially compared with a previously trained model. This approach is best applied in data streaming or Big Data scenarios, as well as in Interactive Machine Learning scenarios. We validate it on a real Fraud Detection case, which is also briefly described.
2021
Authors
Carneiro, D; Pereira, J; Silva, ECE;
Publication
NEURAL COMPUTING & APPLICATIONS
Abstract
Grape reception is a key process in wine production. Harvest days are extremely challenging for managing the reception of the grapes, as the winery needs to deal with the non-uniform arrival of the grapes while guaranteeing suppliers' satisfaction and wine quality. The best management of the resources of the suppliers (i.e., grapes and trucks) and of the winery (i.e., grain-tanks and pressing machines) must be ensured. In this paper, the underlying optimization problem for grape reception is solved by developing a genetic algorithm (GA) tailored for this specific challenge. The results of this algorithm are compared with a FIFO policy for a typical scenario that occurs on the harvest days of a real winery. Additionally, different scenarios are simulated to assess the validity and quality of the solutions found. The results show that, using modest computational resources, it is possible to achieve better solutions with the proposed GA. This allows the algorithm to be used in real time, even when plant conditions change significantly (e.g., when a new truck arrives or a machine fails). Furthermore, the waiting times of trucks and grapes obtained with the developed GA are significantly shorter than those observed with a FIFO approach.
2021
Authors
Carneiro, D; Guimarães, M; Carvalho, M; Novais, P;
Publication
Trends and Applications in Information Systems and Technologies - Volume 1, WorldCIST 2021, Terceira Island, Azores, Portugal, 30 March - 2 April, 2021.
Abstract
In recent years, developments in data collection, storage, processing and analysis technologies have resulted in an unprecedented use of data by organizations. The volume and variety of data, combined with the velocity at which decisions must now be taken and the dynamism of business environments, pose new challenges to Machine Learning. Namely, algorithms must now deal with streaming data, concept drift, and distributed datasets, among others. One common task nowadays is to update or re-train models when data change, as opposed to traditional one-shot batch systems, in which the model is trained only once. This paper addresses the issue of when to update or re-train a model by proposing an approach to predict the performance metrics of the model if it were trained at a given moment, with a specific set of data. We validate the proposed approach in an interactive Machine Learning system in the domain of fraud detection. © 2021, The Author(s), under exclusive license to Springer Nature Switzerland AG.
2021
Authors
Anjos Azevedo, P; Rua Carneiro, D;
Publication
Dereito: revista xurídica da Universidade de Santiago de Compostela
Abstract
2021
Authors
Santos, MC; Borges, AI; Carneiro, DR; Ferreira, FJ;
Publication
ICoMS 2021: 4th International Conference on Mathematics and Statistics, Paris, France, June 24 - 26, 2021
Abstract
Breaks in water consumption records can represent apparent losses, which are generally associated with volumes of water that are consumed but not billed. Detecting these losses at the appropriate time can have a significant economic impact on the water company's revenues. However, the real datasets available to test and evaluate current methods for detecting breaks are not always large enough, or do not present abnormal water consumption patterns. This study proposes an approach to generate synthetic water consumption data with structural breaks that follow the statistical properties of real datasets from a hotel and a hospital. The parameters of the probability distributions (gamma, Weibull, log-Normal, log-logistic, and exponential) that best fit the real water consumption data are used to generate the new datasets. Two decreasing breaks in the mean were inserted in each new dataset, which is associated with one selected probability distribution for each study case, with a time horizon of 914 days. Three different change point detection methods provided by the R packages strucchange and changepoint were evaluated using these new datasets. Based on the Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) performance indices, higher performance was observed for the breakpoint method provided by the package strucchange.
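The data-generation idea in this abstract can be illustrated with a minimal Python sketch. Everything except the 914-day horizon and the use of two decreasing breaks in the mean is an assumption made for illustration: the function name, the break positions, and the gamma shape and scale values are hypothetical, and the study itself fitted distributions to real data and used R rather than Python.

```python
import random
from statistics import mean


def synthetic_consumption(n_days=914, breaks=(300, 600),
                          shape=2.0, scales=(50.0, 35.0, 20.0), seed=0):
    """Daily consumption drawn from gamma distributions whose mean drops at each break.

    The mean of a gamma distribution is shape * scale, so lowering the scale
    after each break produces a decreasing break in the mean.
    All default values are illustrative assumptions, not values from the study.
    """
    if len(scales) != len(breaks) + 1:
        raise ValueError("need one scale per segment (len(breaks) + 1)")
    rng = random.Random(seed)
    series = []
    segment = 0
    for day in range(n_days):
        # Move to the next segment once the day index reaches a break position.
        if segment < len(breaks) and day >= breaks[segment]:
            segment += 1
        series.append(rng.gammavariate(shape, scales[segment]))
    return series


series = synthetic_consumption()
segment_means = [mean(series[:300]), mean(series[300:600]), mean(series[600:])]
print(segment_means)
```

A series generated this way has known break positions, so the output of a change point detection method can be scored against them, for instance with RMSE or MAE on the estimated break locations.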
2020
Authors
Teymourifar, A; Rodrigues, AM; Ferreira, JS;
Publication
WSEAS TRANSACTIONS ON COMPUTERS
Abstract