About
Areas of research:
- Knowledge discovery
  - Supervised learning
  - Multiple predictive models
  - Applied knowledge discovery
- Intelligent transportation systems
  - Planning and operations of public transports
2024
Authors
Mendes Neves, T; Seca, D; Sousa, R; Ribeiro, C; Mendes Moreira, J;
Publication
COMPUTATIONAL ECONOMICS
Abstract
As many automated algorithms find their way into the IT systems of the banking sector, having a way to validate and interpret the results from these algorithms can lead to a substantial reduction in the risks associated with automation. Usually, validating these pricing mechanisms requires human resources to manually analyze and validate large quantities of data. There is a lack of effective methods that analyze the time series and understand if what is currently happening is plausible based on previous data, without information about the variables used to calculate the price of the asset. This paper describes an implementation of a process that allows us to validate many data points automatically. We explore the K-Nearest Neighbors algorithm to find coincident patterns in financial time series, allowing us to detect anomalies, outliers, and data points that do not follow normal behavior. This system allows quicker detection of defective calculations that would otherwise result in the incorrect pricing of financial assets. Furthermore, our method does not require knowledge about the variables used to calculate the time series being analyzed. Our proposal uses pattern matching and can validate more than 58% of instances, substantially improving human risk analysts' efficiency. The proposal is completely transparent, allowing analysts to understand how the algorithm made its decision, increasing the trustworthiness of the method.
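The pattern-matching idea in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `knn_validate`, the Euclidean distance, the choice of `k`, and the `tolerance` band are all assumptions made for the example. The sketch looks up the k past windows most similar to the current one and accepts the newly observed value only if it falls within the range of values that followed those windows.

```python
import math


def knn_validate(history, window, observed, k=3, tolerance=0.05):
    """Check whether `observed` is a plausible successor of `window`.

    Hypothetical helper for illustration only. Returns a pair
    (is_plausible, neighbours), where neighbours lists the k closest
    past windows as (distance, start_index, successor_value).
    """
    w = len(window)
    candidates = []
    # Slide over the history, pairing each window with the value that followed it.
    for start in range(len(history) - w):
        past = history[start:start + w]
        successor = history[start + w]
        candidates.append((math.dist(past, window), start, successor))
    if len(candidates) < k:
        raise ValueError("history is too short for the requested k")

    # Keep the k windows closest to the current one.
    neighbours = sorted(candidates)[:k]
    successors = [value for _, _, value in neighbours]

    # Assumed acceptance rule: the observed value must lie within the
    # range of the neighbours' successors, widened by `tolerance`.
    low = min(successors) - tolerance
    high = max(successors) + tolerance
    return low <= observed <= high, neighbours


history = [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3]
plausible, evidence = knn_validate(history, [1, 2], 3.0)
```

Returning the matched neighbours alongside the verdict mirrors the transparency goal stated in the abstract: an analyst can inspect which past patterns supported the decision.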
2024
Authors
Strecht, P; Mendes Moreira, J; Soares, C;
Publication
ADVANCES IN ARTIFICIAL INTELLIGENCE, AI 2023, PT II
Abstract
A growing number of organizations are adopting a strategy of breaking down large data analysis problems into specific sub-problems, tailoring models for each. However, handling a large number of individual models can pose challenges in understanding organization-wide phenomena. Recent studies focus on using decision trees to create a consensus model by aggregating local decision trees into sets of rules. Despite efforts, the resulting models may still be incomplete, i.e., not able to cover the entire decision space. This paper explores methodologies to tackle this issue by generating complete consensus models from incomplete rule sets, relying on rough estimates of the distribution of independent variables. Two approaches are introduced: synthetic dataset creation followed by decision tree training and a specialized algorithm for creating a decision tree from symbolic data. The feasibility of generating complete decision trees is demonstrated, along with an empirical evaluation on a number of datasets.
2024
Authors
Silva, A; Mendes Moreira, J; Ferreira, C; Costa, N; Dias, D;
Publication
COMPUTERS AND ELECTRONICS IN AGRICULTURE
Abstract
This paper provides a solution for monitoring the location of humans during their activity in the agriculture sector, with the aim of boosting productivity and efficiency. Our solution is based on map-matching methods, which are used to track the path spanned by a worker along a specific activity in an agriculture culture. Two different cultures are taken into consideration in this study: olives and vines. We leverage the symmetry of the geometry of these cultures in our solution and divide the problem three-fold: initially, we estimate the path of a worker along the fields; then we apply map-matching to that path; and finally, a post-processing method is applied to ensure local continuity of the sequence obtained from map-matching. The proposed methods are experimentally evaluated using synthetic and real data in the region of Mirandela, Portugal. Evaluation metrics show that results for synthetic data are robust under several sampling periods, while for real-world data, results for the vine culture are on par with synthetic data, and for the olive culture performance is reduced.
2024
Authors
Pedroto, M; Coelho, T; Fernandes, J; Oliveira, A; Jorge, A; Mendes Moreira, J;
Publication
AMYLOID-JOURNAL OF PROTEIN FOLDING DISORDERS
Abstract
Background: Hereditary transthyretin amyloidosis (ATTRv amyloidosis) is an inherited disease, where the study of family history holds importance. This study evaluates the changes of age-of-onset (AOO) and other age-related clinical factors within and among families affected by ATTRv amyloidosis. Methods: We analysed information from 934 trees, focusing on family, parents, probands and siblings relationships. We focused on 1494 female and 1712 male symptomatic ATTRV30M patients. Results are presented alongside a comparison of current with historical records. Clinical and genealogical indicators identify major changes. Results: Overall, analysis of familial data shows the existence of families with both early and late patients (1/6). It identifies long familial follow-up times since patient families tend to be diagnosed over several years. Finally, results show a large difference between parent-child and proband-patient relationships (20-30 years). Conclusions: This study reveals that there has been a shift in patient profile, with a recent increase in male elderly cases, especially regarding probands. It shows that symptomatic patients exhibit less variability towards siblings, when compared to other family members, namely the transmitting ancestors' age of onset. This can influence genetic counselling guidelines.
2024
Authors
Tuna, R; Baghoussi, Y; Soares, C; Mendes-Moreira, J;
Publication
ADVANCES IN INTELLIGENT DATA ANALYSIS XXII, PT II, IDA 2024
Abstract
Forecasting methods are affected by data quality issues in two ways: 1. they are hard to predict, and 2. they may affect the model negatively when it is updated with new data. The latter issue is usually addressed by pre-processing the data to remove those issues. An alternative approach has recently been proposed, Corrector LSTM (cLSTM), which is a Read & Write Machine Learning (RW-ML) algorithm that changes the data while learning to improve its predictions. Despite promising results being reported, cLSTM is computationally expensive, as it uses a meta-learner to monitor the hidden states of the LSTM. We propose a new RW-ML algorithm, Kernel Corrector LSTM (KcLSTM), that replaces the meta-learner of cLSTM with a simpler method: Kernel Smoothing. We empirically evaluate the forecasting accuracy and the training time of the new algorithm and compare it with cLSTM and LSTM. Results indicate that it is able to decrease the training time while maintaining a competitive forecasting accuracy.
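For readers unfamiliar with kernel smoothing, the technique named in this abstract, a minimal sketch follows. It shows only a generic Gaussian-weighted (Nadaraya-Watson style) smoother over a series; it is not KcLSTM, and how the paper couples smoothing with the LSTM is not described here. The function name `gaussian_kernel_smooth` and the `bandwidth` default are assumptions for illustration.

```python
import math


def gaussian_kernel_smooth(values, bandwidth=1.0):
    """Smooth a sequence with Gaussian weights over the time index.

    Hypothetical helper for illustration only. Each output point is a
    weighted average of all inputs, with weights decaying with the
    distance in time steps; `bandwidth` controls how fast they decay.
    """
    smoothed = []
    n = len(values)
    for i in range(n):
        # Gaussian weight of every point j relative to the target index i.
        weights = [math.exp(-0.5 * ((i - j) / bandwidth) ** 2) for j in range(n)]
        total = sum(weights)
        smoothed.append(sum(w * v for w, v in zip(weights, values)) / total)
    return smoothed


# A single spike is spread over its neighbours and reduced in height.
series = [0.0, 0.0, 10.0, 0.0, 0.0]
smooth_series = gaussian_kernel_smooth(series, bandwidth=1.0)
```

A larger `bandwidth` averages over more neighbours and flattens the series further; a smaller one leaves it closer to the raw data.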
Supervised Thesis
2024
Author
Tiago António Dias Costa Carvalho Mendes
Institution
UP-FEUP
2024
Author
Pedro Alexandre Teixeira Moreira
Institution
UP-FEUP
2024
Author
Mohammad Pasandidehpoor
Institution
UP-FEUP
2024
Author
Pedro Rodrigo Caetano Strecht Ribeiro
Institution
UP-FEUP
2024
Author
Rahul Kumar
Institution
UP-FEUP