Publications

Publications by João Mendes Moreira

2022

Graph Multi-Head Convolution for Spatio-Temporal Attention in Origin Destination Tensor Prediction

Authors
Bhanu, M; Kumar, R; Roy, S; Mendes-Moreira, J; Chandra, J;

Publication
ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2022, PT I

Abstract
Capturing complex spatio-temporal features of thousands of correlated taxi-demand time-series in a city makes traffic flow prediction a challenging task. Hence, several Deep Neural Network (DNN) models have been developed to mimic the latent spatio-temporal behaviour of taxi-demand time-series in a city to improve the prediction results. Despite the good performance of recent DNN-based traffic prediction techniques, such models can only identify adjacent or connected regions with a direct or transitive connection; hence they fail to capture spatio-temporal correlation among regions that exhibit an implicit or latent connection. Additionally, the dependency of recent DNN models on recursive components facilitates error propagation during feature aggregation, with no strategy to counter it. In view of these shortcomings, we introduce a novel DNN model, graph Multi-Head Convolution for Spatio-Temporal Aggregation (gMHC-STA), which supports capturing spatio-temporal correlation among regions with both explicit and implicit connections. Moreover, gMHC-STA aggregates both spatial and temporal characteristics using multi-head attention, thereby avoiding recursive RNNs and their variants and preventing noise propagation. The experimental results of gMHC-STA on two real-world city taxi-demand datasets report a minimum improvement of 6.5-10% over the best state-of-the-art on a standard benchmark metric in varying experimental conditions.

2022

Improving the Prediction of Age of Onset of TTR-FAP Patients Using Graph-Embedding Features

Authors
Pedroto, M; Jorge, A; Mendes Moreira, J; Coelho, T;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2022

Abstract
Transthyretin Familial Amyloid Polyneuropathy (TTR-FAP) is a neurological genetic illness that inflicts severe symptoms after the onset occurs. Age of onset represents the moment a patient starts to experience the symptoms of a disease. An accurate prediction of this event can improve clinical and operational guidelines that define the work of doctors, nurses, and operational staff. In this work, we transform family trees into compact vectors, that is, embeddings, and use these as input features to predict the age of onset of patients with TTR-FAP. Our purpose is to evaluate how information present in genealogical trees can be transformed and used to improve a regression-based setting for TTR-FAP age of onset prediction. Our results show that combining manual and graph-embedding features decreases the mean prediction error when less information is available about a patient's family. With this work, we open the way for future work in representation learning for genealogical data, enabling a more effective exploitation of machine learning approaches.

2022

Density Estimation in High-Dimensional Spaces: A Multivariate Histogram Approach

Authors
Strecht, P; Mendes Moreira, J; Soares, C;

Publication
ADVANCED DATA MINING AND APPLICATIONS, ADMA 2022, PT II

Abstract
Density estimation is an important tool for data analysis. Non-parametric approaches have a reputation for offering state-of-the-art density estimates, but they are limited to a few dimensions. Despite providing less accurate density estimates, histogram-based approaches remain the only alternative for datasets in high-dimensional spaces. In this paper, we present a multivariate histogram approach to estimate the density of a dataset without restrictions on the number of dimensions, containing both numerical and categorical variables (without numerical encoding) and allowing missing data (without the need to preprocess them). Results from the empirical evaluation show that it is possible to estimate the density of datasets without restrictions on dimensionality, and that the method is robust to missing values and categorical variables.

2021

Inmplode: A framework to interpret multiple related rule-based models

Authors
Strecht, P; Mendes Moreira, J; Soares, C;

Publication
EXPERT SYSTEMS

Abstract
There is a growing trend to split problems into separate subproblems and develop separate models for each (e.g., different churn models for separate customer segments; different failure prediction models for separate university courses, etc.). While it may lead to better predictive models, the use of multiple models makes interpretability more challenging. In this paper, we address the problem of synthesizing the knowledge contained in a set of models without a significant loss of prediction performance. We focus on decision tree models because their interpretability makes them suitable for problems involving knowledge extraction. We detail the process, identifying alternative methods to address the different phases involved. An extensive set of experiments is carried out on the problem of predicting the failure of students in courses at the University of Porto. We assess the effect of using different methods for the operations of the methodology, both in terms of the knowledge extracted as well as the accuracy of the combined models.

2022

Tracking Data Visual Representations for Sports Broadcasting Enrichment

Authors
Couceiro, M; Lima, IR; Ulisses, A; Neves, TM; Moreira, JM;

Publication
Proceedings of the 10th International Conference on Sport Sciences Research and Technology Support, icSPORTS 2022, Valletta, Malta, October 27-28, 2022.

Abstract
The broadcast of audio-video sports content is a field with increasingly larger audiences demanding higher quality content and involvement. This growth creates the need to develop more content to engage users and sustain this trend. Otherwise, it may stall or even diminish. Therefore, enhancing the user experience, engagement, and involvement during live sports event broadcasts is of utmost importance. This paper proposes a solution to extract event information from video, resorting to Computer Vision techniques and Deep Learning algorithms. More specifically, the project encompassed the definition and implementation of field registration, object detection and tracking tasks. Focusing on football events, a novel dataset combining several video sources was created and used for analysis and metadata extraction. In particular, the proposed solution can detect and track players with acceptable precision using state-of-the-art methods, like YOLOv5 and DeepSORT. Furthermore, resorting to unsupervised learning techniques, the system provides team segmentation based on the colour of the players’ kits. A series of visual representations of the players’ movements on the field enables broadcast enrichment and an improved user experience. The presented solution is framed in the H2020 DataCloud project and will be deployed in a cloud environment, simplifying its access and utilisation. Copyright © 2022 by SCITEPRESS – Science and Technology Publications, Lda. All rights reserved.

2023

DyGCN-LSTM: A dynamic GCN-LSTM based encoder-decoder framework for multistep traffic prediction

Authors
Kumar, R; Moreira, JM; Chandra, J;

Publication
APPLIED INTELLIGENCE

Abstract
Intelligent transportation systems (ITS) are gaining traction in large cities for better traffic management. Traffic forecasting is an important part of ITS, but a difficult one due to the intricate spatiotemporal relationships of traffic between different locations. Although remote sensors may have temporal and spatial similarities with the sensor being predicted, existing traffic forecasting research focuses primarily on modeling correlations between neighboring sensors while disregarding correlations between remote sensors. Furthermore, existing methods for capturing spatial dependencies, such as graph convolutional networks (GCNs), are unable to capture the dynamic spatial dependence in traffic systems. Self-attention-based techniques currently used to model dynamic correlations among all sensors overlook the hierarchical features of roads and have quadratic computational complexity. Our paper presents a new Dynamic Graph Convolution LSTM Network (DyGCN-LSTM) to address the aforementioned limitations. The novelty of DyGCN-LSTM is that it can model the underlying non-linear spatial and temporal correlations of remotely located sensors at the same time. Experimental investigations conducted using four real-world traffic data sets show that the suggested approach outperforms state-of-the-art benchmarks by 25% in terms of RMSE.
