Publications - INESC TEC

Publications

Publications by CESE

2021

A Meta-Learning Approach to Error Prediction

Authors
Guimaraes, M; Carneiro, D;

Publication
PROCEEDINGS OF 2021 16TH IBERIAN CONFERENCE ON INFORMATION SYSTEMS AND TECHNOLOGIES (CISTI'2021)

Abstract
Machine Learning is one of today's most prominent topics, largely because it is increasingly present in everyday life, even when we do not notice it. What goes even more unnoticed is that every Machine Learning model needs computational power and, of course, data. But how much data is necessary to build the best possible Machine Learning model, and how often must a model be retrained so that it does not become obsolete as the data change? Answering questions of this kind can reduce unnecessary costs for a company. In this paper we propose a novel approach to predict the performance of a model given some characteristics of the data, called meta-features. The goal is to train a new model only when some error metric (e.g., RMSE) is expected to decrease substantially compared with a previously trained model. This approach is best applied in data streaming or Big Data scenarios, as well as in Interactive Machine Learning scenarios. We validate it on a real fraud detection case, which is also briefly described.
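
The retraining decision described in this abstract can be illustrated with a minimal sketch. The abstract does not name the meta-features, the meta-model, or the threshold, so everything below is an assumption made for illustration: the three meta-features, the nearest-neighbour predictor, the 10% minimum gain, and all function names are hypothetical.

```python
import math
from statistics import mean, pstdev


def meta_features(values):
    """Summarise a batch of numeric data with three simple meta-features
    (size, mean, standard deviation). The choice is an assumption."""
    return (float(len(values)), mean(values), pstdev(values))


def predict_rmse(history, features, k=3):
    """Predict the RMSE of a model trained on data with these meta-features
    as the mean RMSE of the k most similar past training runs.

    history is a list of (meta_features, observed_rmse) pairs. Features are
    compared unscaled, which a real system would likely want to avoid.
    """
    nearest = sorted(history, key=lambda rec: math.dist(rec[0], features))[:k]
    return mean(rmse for _, rmse in nearest)


def should_retrain(current_rmse, predicted_rmse, min_gain=0.10):
    """Retrain only if the predicted relative RMSE reduction reaches min_gain."""
    return (current_rmse - predicted_rmse) / current_rmse >= min_gain


# Hypothetical records of past training runs: (meta-features, observed RMSE).
history = [
    ((100.0, 0.0, 1.0), 0.50),
    ((1000.0, 0.0, 1.0), 0.30),
    ((5000.0, 0.0, 1.0), 0.20),
]
expected = predict_rmse(history, (4800.0, 0.0, 1.0), k=1)
print(should_retrain(current_rmse=0.30, predicted_rmse=expected))
```

The point of the design is that predicting the error is much cheaper than training, so a model is rebuilt only when the predicted gain justifies the cost.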


2021

Optimization of the grapes reception process

Authors
Carneiro, D; Pereira, J; Silva, ECE;

Publication
NEURAL COMPUTING & APPLICATIONS

Abstract
Grape reception is a key process in wine production. Harvest days are extremely challenging for managing the reception of the grapes, as the winery needs to deal with the non-uniform arrival of the grapes while guaranteeing suppliers' satisfaction and wine quality. The best management of the resources of the suppliers (i.e., grapes and trucks) and of the winery (i.e., grain-tanks and pressing machines) must be ensured. In this paper, the underlying optimization problem for grape reception is solved by developing a genetic algorithm (GA) tailored for this specific challenge. The results of this algorithm are compared with a FIFO policy for a typical scenario that occurs on the harvest days of a real winery. Additionally, different scenarios are simulated to assess the validity and quality of the solutions found. The results show that, using modest computational resources, it is possible to achieve better solutions with the proposed GA. This allows the algorithm to be used in real time, even when plant conditions change significantly (e.g., when a new truck arrives or when a machine fails). Furthermore, the waiting times of trucks and grapes obtained with the developed GA are significantly shorter than those observed with a FIFO approach.


2021

Optimizing Model Training in Interactive Learning Scenarios

Authors
Carneiro, D; Guimarães, M; Carvalho, M; Novais, P;

Publication
Trends and Applications in Information Systems and Technologies - Volume 1, WorldCIST 2021, Terceira Island, Azores, Portugal, 30 March - 2 April, 2021.

Abstract
In recent years, developments in data collection, storage, processing and analysis technologies have resulted in an unprecedented use of data by organizations. The volume and variety of data, combined with the velocity at which decisions must now be made and the dynamism of business environments, pose new challenges to Machine Learning. Namely, algorithms must now deal with streaming data, concept drift, and distributed datasets, among others. One common task nowadays is to update or re-train models when data change, as opposed to traditional one-shot batch systems, in which the model is trained only once. This paper addresses the issue of when to update or re-train a model by proposing an approach to predict the performance metrics of the model if it were trained at a given moment, with a specific set of data. We validate the proposed approach in an interactive Machine Learning system in the domain of fraud detection. © 2021, The Author(s), under exclusive license to Springer Nature Switzerland AG.


2021

A RELEVÂNCIA DO PORTAL BASE, À LUZ DOS PRINCÍPIOS FUNDAMENTAIS DA CONTRATAÇÃO PÚBLICA E DO PROCEDIMENTO DE FORMAÇÃO DOS CONTRATOS PÚBLICOS EM PORTUGAL

Authors
Anjos Azevedo, P; Rua Carneiro, D;

Publication
Dereito: revista xurídica da Universidade de Santiago de Compostela

Abstract
In the formation and execution of public contracts, the following principles must be respected: legality, pursuit of the public interest, impartiality, proportionality, good faith, protection of legitimate expectations, sustainability and responsibility, competition, publicity and transparency, equal treatment, and non-discrimination. The contract formation procedure is the ordered sequence of acts that contribute to the formation, conclusion, and full legal effectiveness of a public contract. The legislator defines the stages that make up the procedure, following a logic of transparency and guaranteeing impartiality, equal treatment, equal access to the procedure, and procedural adequacy. The main objective of the Base portal is to disseminate information on public contracts concluded in Portugal that are subject to the Public Contracts Code (Código dos Contratos Públicos). To fulfil this objective, the portal serves as a technological tool that centralizes, in a virtual space, information on the formation and execution of public contracts. Keywords: Base portal; principles; public procurement; procedure; public contracts


2021

Synthetic dataset to study breaks in the consumer's water consumption patterns

Authors
Santos, MC; Borges, AI; Carneiro, DR; Ferreira, FJ;

Publication
ICoMS 2021: 4th International Conference on Mathematics and Statistics, Paris, France, June 24 - 26, 2021

Abstract
Breaks in water consumption records can represent apparent losses, which are generally associated with volumes of water that are consumed but not billed. Detecting these losses at the appropriate time can have a significant economic impact on the water company's revenues. However, the real datasets available to test and evaluate current methods for detecting breaks are not always large enough, or do not present abnormal water consumption patterns. This study proposes an approach to generate synthetic water consumption data with structural breaks that follow the statistical properties of real datasets from a hotel and a hospital. The parameters of the best-fit probability distributions (gamma, Weibull, log-Normal, log-logistic, and exponential) for real water consumption data are used to generate the new datasets. Two decreasing breaks in the mean were inserted in each new dataset, associated with one selected probability distribution for each study case, with a time horizon of 914 days. Three different change point detection methods provided by the R packages strucchange and changepoint were evaluated using these new datasets. Based on the Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) performance indices, a higher performance was observed for the breakpoint method provided by the package strucchange.
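
The data generation step described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the gamma distribution is one of the five families listed, but the shape value, the segment means, the break positions, and the function names are hypothetical and do not come from the study. Applying RMSE and MAE to detected break positions is likewise an assumption about how the indices were used.

```python
import math
import random
from statistics import mean


def synthetic_consumption(n_days=914, breaks=(300, 600),
                          means=(10.0, 8.0, 6.0), shape=9.0, seed=0):
    """Generate daily consumption from a gamma distribution whose mean
    drops at each break position (two decreasing breaks by default)."""
    rng = random.Random(seed)
    series = []
    segment = 0
    for day in range(n_days):
        # Move to the next segment once the day reaches the next break.
        if segment < len(breaks) and day >= breaks[segment]:
            segment += 1
        scale = means[segment] / shape  # gamma mean = shape * scale
        series.append(rng.gammavariate(shape, scale))
    return series


def rmse(true_values, estimates):
    """Root Mean Square Error between true and estimated values."""
    return math.sqrt(mean((t - e) ** 2 for t, e in zip(true_values, estimates)))


def mae(true_values, estimates):
    """Mean Absolute Error between true and estimated values."""
    return mean(abs(t - e) for t, e in zip(true_values, estimates))


series = synthetic_consumption()
# Hypothetical output of a change point detector, compared with the true breaks.
detected = [310, 590]
print(len(series), rmse([300, 600], detected), mae([300, 600], detected))
```

Because the true break positions are known by construction, any detector can be scored against them, which is what real datasets without labelled breaks do not allow.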


2020

Geographically Separating Sectors in Multi-Objective Location-Routing Problems

Authors
Teymourifar, A; Rodrigues, AM; Ferreira, JS;

Publication
WSEAS TRANSACTIONS ON COMPUTERS

Abstract
This paper deals with multi-objective location-routing problems (MO-LRPs) and follows a sectorization approach, which means customers are divided into different sectors, and a distribution centre is opened for each sector. The literature has considered objectives such as minimizing the number of opened distribution centres, the variances of compactness, distances and demands in sectors. However, the achievement of these objectives cannot guarantee the geographical separation of sectors. In this sense, and as the geographical separation of sectors can have significant practical relevance, we propose a new objective function and solve a benchmark of problems with the non-dominated sorting genetic algorithm (NSGA-II), which finds multiple non-dominated solutions. A comparison of the results shows the effectiveness of the introduced objective function, since, in the non-dominated solutions obtained, the sectors are more geographically separated when the values of the objective function improve.
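
The notion of a non-dominated solution mentioned in this abstract can be illustrated with a short sketch. It shows only the Pareto dominance test and the extraction of the first non-dominated front for minimization objectives; it is not a full NSGA-II, which also ranks the remaining fronts and uses crowding distance. The objective tuples are hypothetical values, not results from the paper.

```python
def dominates(a, b):
    """Return True if objective vector a dominates b (all objectives minimized):
    a is no worse in every objective and strictly better in at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated(solutions):
    """Return the solutions that no other solution dominates (the first front)."""
    return [s for s in solutions
            if not any(dominates(other, s) for other in solutions)]


# Hypothetical two-objective values for four candidate solutions.
candidates = [(1, 5), (2, 2), (3, 3), (5, 1)]
print(non_dominated(candidates))  # (3, 3) is dominated by (2, 2)
```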
