Publications

Publications by Alípio Jorge

2022

Preface

Authors
Campos R.; Jorge A.M.; Jatowt A.; Bhatia S.; Litvak M.; Rocha C.; Cordeiro J.P.;

Publication
CEUR Workshop Proceedings

Abstract

2022

Improving the Prediction of Age of Onset of TTR-FAP Patients Using Graph-Embedding Features

Authors
Pedroto, M; Jorge, A; Mendes Moreira, J; Coelho, T;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2022

Abstract
Transthyretin Familial Amyloid Polyneuropathy (TTR-FAP) is a neurological genetic illness that inflicts severe symptoms after the onset occurs. Age of onset represents the moment a patient starts to experience the symptoms of a disease. An accurate prediction of this event can improve clinical and operational guidelines that define the work of doctors, nurses, and operational staff. In this work, we transform family trees into compact vectors, that is, embeddings, and use these as input features to predict the age of onset of patients with TTR-FAP. Our purpose is to evaluate how information present in genealogical trees can be transformed and used to improve a regression-based setting for TTR-FAP age of onset prediction. Our results show that combining manual and graph-embedding features decreases the mean prediction error when less information is available about a patient's family. With this work, we open the way for future work in representation learning for genealogical data, enabling a more effective exploitation of machine learning approaches.

2022

Preface to the special issue on dynamic recommender systems and user models

Authors
Vinagre, J; Jorge, AM; Al-Ghossein, M; Bifet, A; Cremonesi, P;

Publication
USER MODELING AND USER-ADAPTED INTERACTION

Abstract
[No abstract available]

2022

The robustness of Random Forest and Support Vector Machine Algorithms to a Faulty Heart Sound Segmentation

Authors
Oliveira, J; Nogueira, DM; Ferreira, CA; Jorge, AM; Coimbra, MT;

Publication
44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, EMBC 2022, Glasgow, Scotland, United Kingdom, July 11-15, 2022

Abstract
Cardiac auscultation is the key exam to screen cardiac diseases both in developed and developing countries. A heart sound auscultation procedure can detect the presence of murmurs and point to a diagnosis; thus it is an important first-line assessment and also a cost-effective tool. The design of automatic recommendation systems based on heart sound auscultation can play an important role in boosting the accuracy and the pervasiveness of screening tools. One such step consists in detecting the fundamental heart sound states, a process known as segmentation. A faulty segmentation or a wrong estimation of the heart rate might prevent heart sound classifiers from detecting abnormal waves, such as murmurs. To understand the impact of a faulty segmentation, several common heart sound segmentation errors are studied in detail, namely those where the heart rate is badly estimated and those where S1/S2 and Systolic/Diastolic states are swapped in comparison with the ground truth state sequence. From the tested algorithms, support vector machines (SVMs) and random forests (RFs) were shown to be more sensitive to a wrong estimation of the heart rate (an expected drop of 6% and 8% on the overall performance, respectively) than to a swap in the state sequence of events (an expected drop of 1.9% and 4.6%, respectively).

2022

Can Multi-channel Heart Sounds Analysis improve Murmur Detection?

Authors
Nogueira, M; Oliveira, J; Ferreira, CG; Coimbra, MT; Jorge, AM;

Publication
2022 IEEE-EMBS INTERNATIONAL CONFERENCE ON BIOMEDICAL AND HEALTH INFORMATICS (BHI) JOINTLY ORGANISED WITH THE IEEE-EMBS INTERNATIONAL CONFERENCE ON WEARABLE AND IMPLANTABLE BODY SENSOR NETWORKS (BSN'22)

Abstract
Cardiac auscultation is still the most cost-effective screening procedure for cardiovascular diseases. The development of computer-assisted methods can empower a large variety of health professionals and thus enable low-cost mass cardiac health screening. The procedure for correct cardiac auscultation includes listening to the heart sounds of the four main auscultation spots. Until recently, attempts to develop automatic heart sound analysis methods that explore the multi-channel richness of a real auscultation were very difficult due to the lack of adequate public datasets. In this work, we use the CirCor Dataset, which is characterized by the existence of more than one heart sound per patient (each patient has heart sounds collected at different auscultation spots). Using this dataset, we evaluate and quantify the comparative impact of using a single-channel or a multi-channel approach. A single-channel approach uses the sound from a single auscultation spot, whereas a multi-channel approach uses four auscultation spots in an asynchronous way. From the different classifiers tested, models that use four auscultation spots achieved a higher overall performance than those that search for abnormalities in a single heart sound spot. Our best result is a multi-channel SVM that analyzes four auscultation spots, with an overall performance of 87.4%. This opens the path to future research using a multi-channel approach.

2022

NaijaSenti: A Nigerian Twitter Sentiment Corpus for Multilingual Sentiment Analysis

Authors
Muhammad, SH; Adelani, DI; Ruder, S; Ahmad, IS; Abdulmumin, I; Bello, BS; Choudhury, M; Emezue, CC; Abdullahi, SS; Aremu, A; Jorge, A; Brazdil, P;

Publication
LREC 2022: THIRTEENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION

Abstract
Sentiment analysis is one of the most widely studied applications in NLP, but most work focuses on languages with large amounts of data. We introduce the first large-scale human-annotated Twitter sentiment dataset for the four most widely spoken languages in Nigeria (Hausa, Igbo, Nigerian-Pidgin, and Yoruba), consisting of around 30,000 annotated tweets per language, including a significant fraction of code-mixed tweets. We propose text collection, filtering, processing, and labeling methods that enable us to create datasets for these low-resource languages. We evaluate a range of pre-trained models and transfer strategies on the dataset. We find that language-specific models and language-adaptive fine-tuning generally perform best. We release the datasets, trained models, sentiment lexicons, and code to incentivize research on sentiment analysis in under-represented languages.
