Publications

Publications by LIAAD

2021

Micro-MetaStream: Algorithm selection for time-changing data

Authors
Rossi, ALD; Soares, C; de Souza, BF; de Carvalho, ACPDF;

Publication
INFORMATION SCIENCES

Abstract
Data stream mining needs to deal with scenarios where the data distribution can change over time. As a result, different learning algorithms can be more suitable in different time periods. This paper proposes micro-MetaStream, a meta-learning-based method to recommend the most suitable learning algorithm for each new example arriving in a data stream. It is an evolution of MetaStream, which recommends learning algorithms for batches of examples. By using a unitary granularity, micro-MetaStream is able to respond more efficiently to changes in data distribution than its predecessor. The meta-data combines meta-features (characteristics describing recent data) with base-level features (the original variables of the new example). In experiments on real-world regression data streams, micro-MetaStream outperformed MetaStream and a baseline method at the meta-level and frequently improved the predictive performance at the base-level.
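The idea of per-example meta-data can be illustrated with a minimal sketch. It is not the authors' implementation: the choice of meta-features (mean and standard deviation of recent targets), the 1-nearest-neighbour meta-learner, the algorithm names, and all data values are assumptions made only for illustration.

```python
from statistics import mean, pstdev

def build_meta_example(recent_targets, new_features):
    """Combine meta-features of recent data with the new example's base-level features."""
    # Assumed meta-features: mean and population std. dev. of recent target values.
    meta_features = [mean(recent_targets), pstdev(recent_targets)]
    return meta_features + list(new_features)

def label_best_algorithm(errors_by_algorithm):
    """Meta-target: the algorithm with the smallest error on one example."""
    return min(errors_by_algorithm, key=errors_by_algorithm.get)

def recommend(meta_example, meta_training_set):
    """Assumed meta-learner: return the label of the nearest stored meta-example."""
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    _, label = min(meta_training_set, key=lambda pair: sq_dist(pair[0], meta_example))
    return label

# Hypothetical meta-training data: one labelled meta-example per past stream example.
meta_training_set = [
    (build_meta_example([1.0, 1.1, 0.9], [0.2]),
     label_best_algorithm({"linear": 0.1, "tree": 0.4})),
    (build_meta_example([5.0, 9.0, 1.0], [0.8]),
     label_best_algorithm({"linear": 2.0, "tree": 0.5})),
]

# A new example arrives: describe it, then recommend an algorithm for it alone.
new_meta_example = build_meta_example([4.0, 8.0, 2.0], [0.7])
recommended = recommend(new_meta_example, meta_training_set)
```

Because a recommendation is produced for every single example rather than for a batch, a shift in the recent window changes the meta-features immediately, which is the intuition behind the unitary granularity described above.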

2021

Discovery Science - 24th International Conference, DS 2021, Halifax, NS, Canada, October 11-13, 2021, Proceedings

Authors
Soares, C; Torgo, L;

Publication
DS

Abstract

2021

Empirical Study on the Impact of Different Sets of Parameters of Gradient Boosting Algorithms for Time-Series Forecasting with LightGBM

Authors
Barros, F; Cerqueira, V; Soares, C;

Publication
PRICAI 2021: Trends in Artificial Intelligence - 18th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2021, Hanoi, Vietnam, November 8-12, 2021, Proceedings, Part I

Abstract
LightGBM has proven to be an effective forecasting algorithm by winning the M5 forecasting competition. However, given the sensitivity of LightGBM to hyperparameters, it is likely that their default values are not optimal. This work aims to answer whether it is essential to tune the hyperparameters of LightGBM to obtain better accuracy in time series forecasting and whether it can be done efficiently. Our experiments consisted of data collection and processing, hyperparameter generation, and finally testing. We observed that on the 58 time series tested, the mean squared error is reduced by a maximum of 17.45% when using randomly generated configurations instead of the default one. Additionally, we studied the performance of the individual hyperparameters. Based on the results obtained, we propose an alternative set of default LightGBM hyperparameter values for forecasting with time series data. © 2021, Springer Nature Switzerland AG.
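The comparison between randomly generated configurations and the default one can be sketched as follows. This is an illustration under stated assumptions, not the paper's protocol: the search ranges and the forecast values are hypothetical, and no model is trained. The parameter names are those of LightGBM's scikit-learn interface.

```python
import random

def sample_config(rng):
    """Draw one random configuration; all ranges below are assumed for illustration."""
    return {
        "num_leaves": rng.randint(8, 256),
        "learning_rate": 10 ** rng.uniform(-3, 0),  # log-uniform in [0.001, 1]
        "n_estimators": rng.randint(50, 1000),
        "min_child_samples": rng.randint(5, 100),
        "subsample": rng.uniform(0.5, 1.0),
        "colsample_bytree": rng.uniform(0.5, 1.0),
    }

def mse(y_true, y_pred):
    """Mean squared error between two equally long sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def relative_mse_reduction(mse_default, mse_candidate):
    """Percentage MSE reduction relative to the default (negative if the candidate is worse)."""
    return 100.0 * (mse_default - mse_candidate) / mse_default

rng = random.Random(0)  # fixed seed so the sampled configurations are reproducible
configs = [sample_config(rng) for _ in range(5)]

# Hypothetical forecasts from a default and a randomly configured model.
y_true = [10.0, 12.0, 11.0, 13.0]
pred_default = [11.0, 11.0, 12.0, 12.0]
pred_candidate = [10.5, 11.5, 11.5, 12.5]

reduction = relative_mse_reduction(mse(y_true, pred_default),
                                   mse(y_true, pred_candidate))
```

In a real study each sampled dictionary would be passed to a model, evaluated on held-out data for every series, and the reductions aggregated across series.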

2021

Promoting Fairness through Hyperparameter Optimization

Authors
Cruz, AF; Saleiro, P; Belem, C; Soares, C; Bizarro, P;

Publication
2021 21ST IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM 2021)

Abstract
Considerable research effort has been directed towards algorithmic fairness, but real-world adoption of bias reduction techniques is still scarce. Existing methods are either metric- or model-specific, require access to sensitive attributes at inference time, or carry high development or deployment costs. This work explores the unfairness that emerges when optimizing ML models solely for predictive performance, and how to mitigate it with a simple and easily deployed intervention: fairness-aware hyperparameter optimization (HO). We propose and evaluate fairness-aware variants of three popular HO algorithms: Fair Random Search, Fair TPE, and Fairband. We validate our approach on a real-world bank account opening fraud case study, as well as on three datasets from the fairness literature. Results show that, without extra training cost, it is feasible to find models with a 111% mean increase in fairness and just a 6% decrease in performance when compared with fairness-blind HO.
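One simple way to make a hyperparameter search fairness-aware is to rank evaluated configurations by a weighted combination of performance and fairness instead of performance alone. The sketch below assumes that scalarization, a hypothetical weight `alpha`, and made-up scores; it is not presented as the exact objective used by the methods named above.

```python
def scalarized_score(performance, fairness, alpha=0.5):
    """Weighted combination; alpha=1.0 ignores fairness (fairness-blind selection)."""
    return alpha * performance + (1.0 - alpha) * fairness

def select_config(candidates, alpha=0.5):
    """Return the candidate with the highest scalarized score."""
    return max(
        candidates,
        key=lambda c: scalarized_score(c["performance"], c["fairness"], alpha),
    )

# Hypothetical evaluation results for three hyperparameter configurations.
candidates = [
    {"name": "A", "performance": 0.80, "fairness": 0.40},
    {"name": "B", "performance": 0.76, "fairness": 0.85},
    {"name": "C", "performance": 0.60, "fairness": 0.90},
]

fairness_blind = select_config(candidates, alpha=1.0)  # selects on performance only
fairness_aware = select_config(candidates, alpha=0.5)  # trades performance for fairness
```

With these made-up numbers the fairness-blind choice is configuration A, while the fairness-aware choice is B, which gives up a little performance for a much higher fairness score. Since the same trained models are simply ranked differently, the selection itself adds no training cost.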

2021

Machine Learning Informed Decision-Making with Interpreted Model's Outputs: A Field Intervention

Authors
Zejnilovic, L; Lavado, S; Soares, C; de Rituerto De Troya, ÍM; Bell, A; Ghani, R;

Publication
81st Annual Meeting of the Academy of Management 2021: Bringing the Manager Back in Management, AoM 2021

Abstract
Despite having set the theoretical ground for explainable systems decades ago, information systems scholars have given little attention to new developments in decision-making with humans-in-the-loop in real-world problems. We adopt a sociotechnical systems lens and employ a mixed-methods analysis of a field intervention to study machine-learning informed decision-making with interpreted models' outputs. Contrary to theory, our results suggest a small positive effect of explanations on confidence in the final decision, and a negligible effect on the decisions' quality. We uncover complex dynamic interactions between humans and algorithms, and the interplay of algorithmic aversion, trust, experts' heuristics, and changing uncertainty-resolving conditions.

2021

Inmplode: A framework to interpret multiple related rule-based models

Authors
Strecht, P; Mendes Moreira, J; Soares, C;

Publication
EXPERT SYSTEMS

Abstract
There is a growing trend to split problems into separate subproblems and develop separate models for each (e.g., different churn models for separate customer segments, or different failure prediction models for separate university courses). While it may lead to better predictive models, the use of multiple models makes interpretability more challenging. In this paper, we address the problem of synthesizing the knowledge contained in a set of models without a significant loss of prediction performance. We focus on decision tree models because their interpretability makes them suitable for problems involving knowledge extraction. We detail the process, identifying alternative methods to address the different phases involved. An extensive set of experiments is carried out on the problem of predicting the failure of students in courses at the University of Porto. We assess the effect of using different methods for the operations of the methodology, in terms of both the knowledge extracted and the accuracy of the combined models.
