Publications

Publications by LIAAD

2022

Visualização da relevância relativa de investigadores a partir da sua produção textual [Visualizing the relative relevance of researchers based on their textual output]

Authors
Trigo, L; Brazdil, P

Publication
Linguística: Revista de Estudos Linguísticos da Universidade do Porto

Abstract
Building a researcher affinity network through the automatic processing of their publications allows us to gain a perspective that goes beyond the networks established through co-authorship. The importance of each researcher is defined by the volume of their bibliographic production, i.e., the number of publications, and by their centrality in the general network of researchers. The centrality of a researcher in a network reveals their importance in communication flows with other researchers, on the assumption that communication between researchers is itself a relevant factor in organizational life and production. Both network and centrality concepts are more easily interpreted graphically. In this study, we explore the workflow that provides these visualizations and focus on the empirical selection of the most appropriate centrality measure. We also propose a centrality visualization method that facilitates the interpretation of the selected measures.

2022

The impact of heterogeneous distance functions on missing data imputation and classification performance

Authors
Santos, MS; Abreu, PH; Fernandez, A; Luengo, J; Santos, J

Publication
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE

Abstract
This work performs an in-depth study of the impact of distance functions on K-Nearest Neighbours imputation of heterogeneous datasets. Missing data is generated at several percentages, on a large benchmark of 150 datasets (50 continuous, 50 categorical and 50 heterogeneous datasets) and data imputation is performed using different distance functions (HEOM, HEOM-R, HVDM, HVDM-R, HVDM-S, MDE and SIMDIST) and k values (1, 3, 5 and 7). The impact of distance functions on kNN imputation is then evaluated in terms of classification performance, through the analysis of a classifier learned from the imputed data, and in terms of imputation quality, where the quality of the reconstruction of the original values is assessed. By analysing the properties of heterogeneous distance functions over continuous and categorical datasets individually, we then study their behaviour over heterogeneous data. We discuss whether datasets with different natures may benefit from different distance functions and to what extent the component of a distance function that deals with missing values influences such choice. Our experiments show that missing data has a significant impact on distance computation and the obtained results provide guidelines on how to choose appropriate distance functions depending on data characteristics (continuous, categorical or heterogeneous datasets) and the objective of the study (classification or imputation tasks).
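As an illustration of the kind of distance function studied above, the following is a minimal sketch of kNN imputation using HEOM (the Heterogeneous Euclidean-Overlap Metric) in its commonly cited form: overlap distance for categorical attributes, range-normalised absolute difference for continuous ones, and the maximum per-attribute distance of 1 whenever a value is missing. The function names `heom` and `knn_impute`, the `ranges` argument and the use of `None` for missing values are assumptions made for this example; this is not the authors' implementation.

```python
import math

def heom(x, y, is_categorical, ranges):
    """HEOM distance between two records; None marks a missing value."""
    total = 0.0
    for a, (xa, ya) in enumerate(zip(x, y)):
        if xa is None or ya is None:
            d = 1.0  # a missing value yields the maximum per-attribute distance
        elif is_categorical[a]:
            d = 0.0 if xa == ya else 1.0  # overlap distance
        else:
            # absolute difference normalised by the attribute's observed range
            d = abs(xa - ya) / ranges[a] if ranges[a] else 0.0
        total += d * d
    return math.sqrt(total)

def knn_impute(target, donors, attr, k, is_categorical, ranges):
    """Impute target[attr] from the k nearest donors that observe attr."""
    candidates = [d for d in donors if d[attr] is not None]
    if not candidates:
        raise ValueError("no donor has an observed value for this attribute")
    candidates.sort(key=lambda d: heom(target, d, is_categorical, ranges))
    values = [d[attr] for d in candidates[:k]]
    if is_categorical[attr]:
        return max(set(values), key=values.count)  # most frequent value
    return sum(values) / len(values)  # mean of the neighbours' values

# Hypothetical records: (age, colour); age spans a range of 40.
is_cat = [False, True]
ranges = [40, None]
donors = [(20, "red"), (30, "red"), (60, "blue")]
imputed_age = knn_impute((None, "red"), donors, 0, 2, is_cat, ranges)
```

In this toy example the two "red" donors are nearest to the incomplete record, so the imputed age is the mean of their ages. Swapping `heom` for another distance function changes only the `key` used for sorting, which is the comparison the abstract describes.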

2022

The identification of cancer lesions in mammography images with missing pixels: analysis of morphology

Authors
Santos, JC; Abreu, PH; Santos, MS

Publication
2022 IEEE 9TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA)

Abstract
The quality of mammography images is essential for the diagnosis of breast cancer and image imputation has become a popular technique to overcome noise, artifacts, and missing data to aid in the diagnosis of diseases. In this paper, we assess the performance of six imputation methodologies for the reconstruction of missing pixels in different morphologies in mammography images. The images included in this study are collected from four public datasets (CBIS-DDSM, Mini-MIAS, INbreast, and CSAW) and the imputation results are evaluated through the mean absolute error (MAE) and structural similarity index measure (SSIM). This study goes beyond the traditional evaluation of imputation algorithms, analyzing imputation quality, morphology preservation and classification performance. The effects of imputation on the morphology of cancer lesions are of utmost importance since they lay the foundation for physicians to interpret and analyze the imputation results. The results show that DIP is the most promising methodology for higher missing pixel rates, morphology preservation, and classifying malignant and benign images.

2022

Brown-Sequard syndrome in a patient with spondyloarthritis after COVID-19 vaccine: a challenging differential diagnosis

Authors
Costa, R; Soares, C; Vaz, C; Bernardes, M; Tavares, M; Abreu, P

Publication
ARP RHEUMATOLOGY

Abstract

2021

Statistically Robust Evaluation of Stream-Based Recommender Systems

Authors
Vinagre, J; Jorge, AM; Rocha, C; Gama, J

Publication
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING

Abstract
Online incremental models for recommendation are nowadays pervasive in both industry and academia. However, there is not yet a standard evaluation methodology for the algorithms that maintain such models. Moreover, online evaluation methodologies available in the literature generally fall short in the statistical validation of results, since this validation is not trivially applicable to stream-based algorithms. We propose a k-fold validation framework for the pairwise comparison of recommendation algorithms that learn from user feedback streams, using prequential evaluation. Our proposal enables continuous statistical testing on adaptive-size sliding windows over the outcome of the prequential process, allowing practitioners and researchers to make decisions in real time based on solid statistical evidence. We present a set of experiments to gain insights into the sensitivity and robustness of two statistical tests (McNemar's and Wilcoxon signed rank) in a streaming data environment. Our results show that besides allowing a real-time, fine-grained online assessment, the online versions of the statistical tests are at least as robust as the batch versions, and definitely more robust than a simple prequential single-fold approach.
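To make the idea of continuous testing over a prequential process concrete, here is a minimal sketch of McNemar's test recomputed over a sliding window of paired outcomes. Each stream item contributes a pair of booleans recording whether algorithm A and algorithm B produced a hit before being updated with that item. The sketch assumes a fixed-size window for simplicity, whereas the abstract describes adaptive-size windows; the function names and the sample data are hypothetical and this is not the authors' framework.

```python
from collections import deque

def mcnemar_statistic(outcomes):
    """McNemar's chi-square statistic (no continuity correction).

    outcomes: iterable of (a_hit, b_hit) boolean pairs.
    Only discordant pairs, where exactly one algorithm hits, contribute.
    """
    n10 = n01 = 0
    for a_hit, b_hit in outcomes:
        if a_hit and not b_hit:
            n10 += 1
        elif b_hit and not a_hit:
            n01 += 1
    if n10 + n01 == 0:
        return 0.0  # no discordant pairs: no evidence of a difference
    return (n10 - n01) ** 2 / (n10 + n01)

def sliding_mcnemar(paired_hits, window_size):
    """Recompute the statistic after every new pair, over the last window_size pairs."""
    window = deque(maxlen=window_size)  # oldest pairs drop out automatically
    stats = []
    for pair in paired_hits:
        window.append(pair)
        stats.append(mcnemar_statistic(window))
    return stats

# Hypothetical prequential outcomes for two recommenders.
stream = [(True, False), (True, False), (False, True)]
stats = sliding_mcnemar(stream, window_size=2)
```

A statistic above 3.841, the chi-square critical value with one degree of freedom at the 0.05 level, would suggest a significant difference between the two algorithms within the current window.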

2021

A Hybrid Recommender System for Improving Automatic Playlist Continuation

Authors
Gatzioura, A; Vinagre, J; Jorge, AM; Sanchez Marre, M

Publication
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING

Abstract
Although widely used, most current music recommender systems still focus on recommendation accuracy, user preferences and isolated item characteristics, without evaluating other important factors, such as joint item selections and the recommendation moment. However, when it comes to playlist recommendations, additional dimensions, as well as the notion of user experience and perception, should be taken into account to improve recommendation quality. In this work, we propose HybA, a hybrid recommender system for automatic playlist continuation that combines Latent Dirichlet Allocation and Case-Based Reasoning. This system aims to address "similar concepts" rather than similar users. Rather than just generating a playlist based on user requirements, as automatic playlist generation methods do, HybA identifies the semantic characteristics of a started playlist and reuses the most similar past ones to recommend relevant playlist continuations. In addition, it supports beyond-accuracy dimensions, such as increased coherence or the discovery of diverse items. A graph model is used to overcome the semantic gap between music descriptions and user preferences, identify playlist structures and capture song similarity. Experiments on real datasets have shown that the proposed algorithm is able to outperform other state-of-the-art techniques in terms of accuracy, while balancing diversity and coherence.
