Publications

Publications by LIAAD

2023

Predicting Age from Human Lung Tissue Through Multi-modal Data Integration

Authors
Moraes, A; Moreno, M; Ribeiro, R; Ferreira, G;

Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract
The accurate prediction of biological age can bring important benefits in promoting therapeutic and behavioural strategies for healthy aging. We propose the development of age prediction models using multi-modal datasets, including transcriptomics, methylation and histological images from lung tissue samples of 793 human donors. From a technical point of view this is a challenging problem since not all donors are covered by the same data modalities and the datasets have a very high feature dimensionality with a relatively smaller number of samples. To fairly compare performance across different data types, we’ve created a test set including donors represented in each modality. Given the unique characteristics of the data distribution, we developed gradient boosting tree and convolutional neural network models for each dataset. The performance of the models can be affected by several covariates, including smoking history, and, most importantly, by a skewed distribution of age. Data-centric approaches, including feature engineering, feature selection, data stratification and resampling, proved fundamental in building models that were optimally adapted for each data modality, resulting in significant improvements in model performance for imbalanced regression. The models were then applied to the test set independently, and later combined into a multi-modal ensemble through a voting strategy, predicting age with a median absolute error of 4 years. Even if prediction accuracy remains a challenge, in this work we provide insights to address the difficulties of multi-modal data integration and imbalanced data prediction. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
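The abstract does not state the exact voting rule, so the following is only a minimal sketch of one plausible reading: each donor's age is the mean of whichever per-modality predictions exist for that donor, and the ensemble is scored with the median absolute error. The function names, donor identifiers, and numbers are hypothetical and are not taken from the paper.

```python
from statistics import mean, median

def vote(predictions):
    """Average the available per-modality predictions for each donor.

    `predictions` maps a donor id to a list of predicted ages, one per
    modality; None marks a modality that is missing for that donor.
    Assumption: plain averaging stands in for the paper's voting strategy.
    """
    return {
        donor: mean([p for p in preds if p is not None])
        for donor, preds in predictions.items()
    }

def median_absolute_error(y_true, y_pred):
    """Median of |true age - predicted age| over all donors in y_true."""
    return median([abs(y_true[d] - y_pred[d]) for d in y_true])

# Hypothetical toy data: three donors, up to three modalities each.
toy_predictions = {
    "d1": [50.0, 54.0, None],
    "d2": [30.0, 34.0, 32.0],
    "d3": [70.0, None, 60.0],
}
toy_ages = {"d1": 55, "d2": 31, "d3": 68}

ensemble = vote(toy_predictions)
print(median_absolute_error(toy_ages, ensemble))
```

Skipping missing modalities keeps donors that are covered by only one or two data types in the evaluation, which matches the coverage problem the abstract describes.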

2023

Mapeamento do Perfil das Mulheres Brasileiras em Processamento de Linguagem Natural [Mapping the Profile of Brazilian Women in Natural Language Processing]

Authors
Caseli, H; Amorim, E; Schneider, ETR; Freitas, LIA; Rodrigues, J; Nunes, MdGV;

Publication
Anais do XVII Women in Information Technology (WIT 2023)

Abstract
Knowing the profile of Brazilian women working in Natural Language Processing (NLP) is an important step towards developing policies and programmes that aim to increase inclusion and diversity in the field. This is the first study carried out in Brazil for this purpose. Based on data collected through a public survey, Lattes and LinkedIn, we found that the typical profile is one of a background in computing or linguistics, working in companies or universities, but with little ethnic diversity and apparent difficulty in reconciling professional life and motherhood. Looking more specifically at the group “Brasileiras em PLN”, we found a substantial capacity for publication and supervision, but still little collaboration among our members.

2023

One-Step Discrete Fourier Transform-Based Sinusoid Frequency Estimation under Full-Bandwidth Quasi-Harmonic Interference

Authors
Silva, JM; Oliveira, MA; Saraiva, AF; Ferreira, AJS;

Publication
ACOUSTICS

Abstract
The estimation of the frequency of sinusoids has been the object of intense research for more than 40 years. Its importance in classical fields such as telecommunications, instrumentation, and medicine has been extended to numerous specific signal processing applications involving, for example, speech, audio, and music processing. In many cases, these applications run in real-time and, thus, require accurate, fast, and low-complexity algorithms. Taking the normalized Cramer-Rao lower bound as a reference, this paper evaluates the relative performance of nine non-iterative discrete Fourier transform-based individual sinusoid frequency estimators when the target sinusoid is affected by full-bandwidth quasi-harmonic interference, in addition to stationary noise. Three levels of the quasi-harmonic interference severity are considered: no harmonic interference, mild harmonic interference, and strong harmonic interference. Moreover, the harmonic interference is amplitude-modulated and frequency-modulated reflecting real-world conditions, e.g., in singing and musical chords. Results are presented for when the Signal-to-Noise Ratio varies between -10 dB and 70 dB, and they reveal that the relative performance of different frequency estimators depends on the SNR and on the selectivity and leakage of the window that is used, but also changes drastically as a function of the severity of the quasi-harmonic interference. In particular, when this interference is strong, the performance curves of the majority of the tested frequency estimators collapse to a few trends around and above 0.4% of the DFT bin width.
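To illustrate what a non-iterative, DFT-based single-sinusoid frequency estimator looks like, here is a minimal sketch. It uses a Hann window and parabolic interpolation of the log-magnitude around the peak bin, a textbook technique chosen for illustration. It is not claimed to be one of the nine estimators evaluated in the paper, and the function names are hypothetical.

```python
import cmath
import math

def dft_bin(x, k):
    """Return the k-th DFT coefficient of x, computed by direct summation."""
    n = len(x)
    return sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))

def estimate_frequency(x, fs):
    """Estimate the frequency (Hz) of the dominant sinusoid in x in one step.

    Steps: apply a Hann window, locate the largest DFT magnitude bin, then
    refine the peak position by fitting a parabola to the log-magnitudes of
    the peak bin and its two neighbours.
    """
    n = len(x)
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]
    xw = [a * w for a, w in zip(x, window)]
    mags = [abs(dft_bin(xw, k)) for k in range(n // 2)]
    # Exclude the edge bins so that both neighbours of the peak exist.
    k = max(range(1, n // 2 - 1), key=lambda i: mags[i])
    # Guard against log(0) for pathological all-zero neighbours.
    a, b, c = (math.log(max(mags[j], 1e-300)) for j in (k - 1, k, k + 1))
    delta = 0.5 * (a - c) / (a - 2 * b + c)  # fractional bin offset
    return (k + delta) * fs / n

# Noise-free test tone placed between two DFT bins (bin 40.3 of 256).
fs, n = 8000.0, 256
f0 = 40.3 * fs / n
tone = [math.sin(2 * math.pi * f0 * i / fs + 0.3) for i in range(n)]
estimate = estimate_frequency(tone, fs)
```

With this window the parabolic fit is only approximate, so a small bias of a fraction of a bin remains even without noise. Estimators of this family differ mainly in the window and in the interpolation rule applied around the peak.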

2023

Analysis and Re-Synthesis of Natural Cricket Sounds Assessing the Perceptual Relevance of Idiosyncratic Parameters

Authors
Oliveira, M; Almeida, V; Silva, J; Ferreira, A;

Publication
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Abstract
Cricket sounds are usually regarded as pleasant and, thus, can be used as suitable test signals in psychoacoustic experiments assessing the human listening acuity to specific temporal and spectral features. In addition, the simple structure of cricket sounds makes them prone to reverse engineering such that they can be analyzed and re-synthesized with desired alterations in their defining parameters. This paper describes cricket sounds from a parametric point of view, characterizes their main temporal and spectral features, namely jitter, shimmer and frequency sweeps, and explains a re-synthesis process generating modified natural cricket sounds. These are subsequently used in listening tests helping to shed light on the sound identification and discrimination capabilities of humans that are important, for example, in voice recognition. © 2023 IEEE.

2023

Time Series of Counts under Censoring: A Bayesian Approach

Authors
Silva, I; Silva, ME; Pereira, I; McCabe, B;

Publication
ENTROPY

Abstract
Censored data are frequently found in diverse fields including environmental monitoring, medicine, economics and social sciences. Censoring occurs when observations are available only for a restricted range, e.g., due to a detection limit. Ignoring censoring produces biased estimates and unreliable statistical inference. The aim of this work is to contribute to the modelling of time series of counts under censoring using convolution closed infinitely divisible (CCID) models. The emphasis is on estimation and inference problems, using Bayesian approaches with Approximate Bayesian Computation (ABC) and Gibbs sampler with Data Augmentation (GDA) algorithms.

2023

Automatic characterisation of Dansgaard-Oeschger events in palaeoclimate ice records

Authors
Barbosa, S; Silva, ME; Dias, N; Rousseau, D;

Publication

Abstract
Greenland ice core records display abrupt transitions, designated as Dansgaard-Oeschger (DO) events, characterised by episodes of rapid warming (typically decades) followed by a slower cooling. The identification of abrupt transitions is hindered by the typical low resolution and small size of palaeoclimate records, and their significant temporal variability. Furthermore, the amplitude and duration of the DO events vary substantially along the last glacial period, which further hinders the objective identification of abrupt transitions from ice core records. Automatic, purely data-driven methods have the potential to foster the identification of abrupt transitions in palaeoclimate time series in an objective way, complementing the traditional identification of transitions by visual inspection of the time series.

In this study we apply an algorithmic time series method, the Matrix Profile approach, to the analysis of the NGRIP Greenland ice core record, focusing on:
- the ability of the method to retrieve abrupt transitions automatically, by comparing the anomalies identified by the matrix profile method with the expert-based identification of DO events;
- the characterisation of DO events, by classifying DO events in terms of shape and identifying events with similar warming/cooling temporal patterns.

The results for the NGRIP time series show that the matrix profile approach struggles to retrieve all the abrupt transitions that are identified by experts as DO events, the main limitation arising from the diversity in length of DO events and the method’s dependence on fixed-size sub-sequences within the time series. However, the matrix profile method is able to characterise the similarity of shape patterns between DO events in an objective and consistent way.
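For readers unfamiliar with the matrix profile, the sketch below shows the basic idea with a brute-force implementation: for every fixed-length sub-sequence, store the z-normalised Euclidean distance to its nearest non-overlapping neighbour. It is an illustration only. The synthetic series, the window length, and the exclusion-zone width (half the window) are assumptions, and practical analyses would normally use an optimised library rather than this quadratic loop.

```python
import math

def znorm(seq):
    """Z-normalise a sub-sequence; a constant sub-sequence maps to zeros."""
    mu = sum(seq) / len(seq)
    sd = math.sqrt(sum((v - mu) ** 2 for v in seq) / len(seq))
    if sd == 0:
        return [0.0] * len(seq)
    return [(v - mu) / sd for v in seq]

def matrix_profile(series, m):
    """Brute-force matrix profile for sub-sequences of fixed length m.

    profile[i] is the distance from sub-sequence i to its nearest
    neighbour, ignoring trivial matches within m // 2 positions of i.
    """
    n = len(series) - m + 1
    subs = [znorm(series[i:i + m]) for i in range(n)]
    excl = m // 2  # assumed exclusion-zone width
    profile = []
    for i in range(n):
        best = math.inf
        for j in range(n):
            if abs(i - j) <= excl:
                continue
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(subs[i], subs[j])))
            best = min(best, d)
        profile.append(best)
    return profile

# Synthetic series: a sine wave of period 20 with one half-cycle flipped.
series = [math.sin(2 * math.pi * i / 20) for i in range(200)]
for i in range(100, 110):
    series[i] = -series[i]

profile = matrix_profile(series, m=20)
discord = max(range(len(profile)), key=lambda i: profile[i])
```

High profile values mark sub-sequences with no close match elsewhere (candidate anomalies), while low values mark repeated shapes. The fixed window length `m` is the dependence on fixed-size sub-sequences that the abstract identifies as the main limitation for DO events of varying duration.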
