Cookies
O website necessita de alguns cookies e outros recursos semelhantes para funcionar. Caso o permita, o INESC TEC irá utilizar cookies para recolher dados sobre as suas visitas, contribuindo, assim, para estatísticas agregadas que permitem melhorar o nosso serviço. Ver mais
Aceitar Rejeitar
  • Menu
Publicações

2026

A subject-based association network defines new pediatric sleep apnea phenotypes with different odds of recovery after treatment

Autores
Gutiérrez-Tobal, GC; Gomez-Pilar, J; Ferreira-Santos, D; Pereira-Rodrigues, P; Alvarez, D; del Campo, F; Gozal, D; Hornero, R;

Publicação
COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE

Abstract
Background and objectives: Timely treatment of pediatric obstructive sleep apnea (OSA) can prevent or reverse neurocognitive and cardiovascular morbidities. However, whether distinct phenotypes exist and account for divergent treatment effectiveness remains unknown. In this study, our goal is threefold: i) to define new data-driven pediatric OSA phenotypes, ii) to evaluate possible treatment effectiveness differences among them, and iii) to assess phenotypic information in predicting OSA resolution. Methods: We involved 22 sociodemographic, anthropometric, and clinical data from 464 children (5-10 years old) from the Childhood Adenotonsillectomy Trial (CHAT) database. Baseline information was used to automatically define pediatric OSA phenotypes using a new unsupervised subject-based association network. Follow-up data (7 months later) were used to evaluate the effects of the therapeutic intervention in terms of changes in the obstructive apnea-hypopnea index (OAHI) and the resolution of OSA (OAHI < 1 event per hour). An explainable artificial intelligence (XAI) approach was also developed to assess phenotypic information as OSA resolution predictor at baseline. Results: Our approach identified three OSA phenotypes (PHOSA1-PHOSA3), with PHOSA2 showing significantly lower odds of OSA recovery than PHOSA1 and PHOSA3 when treatment information was not considered (odds ratios, OR: 1.64 and 1.66, 95 % confidence intervals, CI: 1.03-2.62 and 1.01-2.69, respectively). The odds of OSA recovery were also significantly lower in PHOSA2 than in PHOSA3 when adenotonsillectomy was adopted as treatment (OR: 2.60, 95 % CI: 1.26-5.39). Our XAI approach identified 79.4 % (CI: 69.9-88.0 %) of children reaching OSA resolution after adenotonsillectomy, with a positive predictive value of 77.8 % (CI: 70.3 %-86.0 %). Conclusions: Our new subject-based association network successfully identified three clinically useful pediatric OSA phenotypes with different odds of therapeutic intervention effectiveness. Specifically, we found that children of any sex, >6 years old, overweight or obese, and with enlarged neck and waist circumference (PHOSA2) have less odds of recovering from OSA. Similarly, younger female children with no enlarged neck (PHOSA3) have higher odds of benefiting from adenotonsillectomy.

2026

Video-based epileptic seizure classification: A novel multi-stage approach integrating vision and motion transformer deep learning models

Autores
Aslani, R; Karácsony, T; Fearns, N; Caldeiras, C; Vollmar, C; Rego, R; Rémi, J; Noachtar, S; Cunha, JPS;

Publicação
BIOMEDICAL SIGNAL PROCESSING AND CONTROL

Abstract
Automated seizure quantification and classification are needed for semiology-based epileptic seizure diagnosis support. To the best of our knowledge, the 5-class (Hypermotor, Automotor, Complex Motor, Psychogenic Non-Epileptic Seizures, and Generalized Tonic-Clonic Seizures) seizure video dataset (198 seizures from 74 patients) studied in this paper is the largest 5-class dataset ever curated, composed of monocular RGB videos from two university hospital epilepsy monitoring units. 2D skeletons were estimated using ViTPose, a vision transformer deep learning (DL) architecture, and lifted to 3D space using MotionBERT, a multimodal motion transformer architecture. The movements were quantified based on the estimated 3D skeleton sequences. Two approaches were evaluated for seizure classification: (1) classical machine learning methods (Random Forest (RF) and XGBoost) applied to quantified movement parameters, and (2) 2D skeleton-based DL using MotionBERT action, an action recognition DL model, to which we perform transfer-learning. The best model achieved a promising, above literature, 5-fold cross-validated macro average F1-score of 0.84 +/- 0.09 (RF) for 5-class classification. The binary case (Automotor vs Hypermotor) resulted in 0.80 +/- 0.18 (MotionBERT action), and adding a 3rd class (Complex motor) lowered to 0.65 +/- 0.14 (RF). This novel multi-stage classification ensures that the included movement features are traceable, allowing interpretable AI exploration of this novel approach supporting future clinical diagnosis.

2026

Optimizing Medical Image Captioning with Conditional Prompt Encoding

Autores
Fernandes, RF; Oliveira, HS; Ribeiro, PP; Oliveira, HP;

Publicação
PATTERN RECOGNITION AND IMAGE ANALYSIS, IBPRIA 2025, PT II

Abstract
Medical image captioning is an essential tool to produce descriptive text reports of medical images. One of the central problems of medical image captioning is their poor domain description generation because large pre-trained language models are primarily trained in non-medical text domains with different semantics of medical text. To overcome this limitation, we explore improvements in contrastive learning for X-ray images complemented with soft prompt engineering for medical image captioning and conditional text decoding for caption generation. The main objective is to develop a softprompt model to improve the accuracy and clinical relevance of the automatically generated captions while guaranteeing their complete linguistic accuracy without corrupting the models' performance. Experiments on the MIMIC-CXR and ROCO datasets showed that the inclusion of tailored soft-prompts improved accuracy and efficiency, while ensuring a more cohesive medical context for captions, aiding medical diagnosis and encouraging more accurate reporting.

2026

Overview of the CLEF 2025 JOKER Lab: Humour in Machine

Autores
Ermakova, L; Campos, R; Bosser, AG; Miller, T;

Publicação
EXPERIMENTAL IR MEETS MULTILINGUALITY, MULTIMODALITY, AND INTERACTION, CLEF 2025

Abstract
Humour poses a unique challenge for artificial intelligence, as it often relies on non-literal language, cultural references, and linguistic creativity. The JOKER Lab, now in its fourth year, aims to advance computational humour research through shared tasks on curated, multilingual datasets, with applications in education, computer-mediated communication and translation, and conversational AI. This paper provides an overview of the JOKER Lab held at CLEF 2025, detailing the setup and results of its three main tasks: (1) humour-aware information retrieval, which involves searching a document collection for humorous texts relevant to user queries in either English or Portuguese; (2) pun translation, focussed on humour-preserving translation of paronomastic jokes from English into French; and (3) onomastic wordplay translation, a task addressing the translation of name-based wordplay from English into French. The 2025 edition builds upon previous iterations by expanding datasets and emphasising nuanced, manual evaluation methods. The Task 1 results show a marked improvement this year, apparently due to participants' judicious combination of retrieval and filtering techniques. Tasks 2 and 3 remain challenging, not only in terms of system performance but also in terms of defining meaningful and reliable evaluation metrics.

2026

Unsupervised contrastive analysis for anomaly detection in brain MRIs via conditional diffusion models

Autores
Patrício, C; Barbano, CA; Fiandrotti, A; Renzulli, R; Grangetto, M; Teixeira, LF; Neves, JC;

Publicação
PATTERN RECOGNITION LETTERS

Abstract
Contrastive Analysis (CA) detects anomalies by contrasting patterns unique to a target group (e.g., unhealthy subjects) from those in a background group (e.g., healthy subjects). In the context of brain MRIs, existing CA approaches rely on supervised contrastive learning or variational autoencoders (VAEs) using both healthy and unhealthy data, but such reliance on target samples is challenging in clinical settings. Unsupervised Anomaly Detection (UAD) learns a reference representation of healthy anatomy, eliminating the need for target samples. Deviations from this reference distribution can indicate potential anomalies. In this context, diffusion models have been increasingly adopted in UAD due to their superior performance in image generation compared to VAEs. Nonetheless, precisely reconstructing the anatomy of the brain remains a challenge. In this work, we bridge CA and UAD by reformulating contrastive analysis principles for the unsupervised setting. We propose an unsupervised framework to improve the reconstruction quality by training a self-supervised contrastive encoder on healthy images to extract meaningful anatomical features. These features are used to condition a diffusion model to reconstruct the healthy appearance of a given image, enabling interpretable anomaly localization via pixel-wise comparison. We validate our approach through a proof-of-concept on a facial image dataset and further demonstrate its effectiveness on four brain MRI datasets, outperforming baseline methods in anomaly localization on the NOVA benchmark.

2026

Challenges and Opportunities for Designing Digital Communication Interfaces for Persons with Partial Locked-In Syndrome

Autores
Amado, P; Penedos-Santiago, E; Lima, C; Simoes, S; Giesteira, B; Peçaibes, V;

Publicação
ARTSIT, INTERACTIVITY AND GAME CREATION, ARTSIT 2024, PT II

Abstract
This integrative literature review synthesizes insights from multiple disciplines to address the challenges and opportunities in designing digital communication interfaces for persons with Locked-In Syndrome (LIS). The paper highlights the importance of a multidisciplinary approach that includes ethical co-design, visual design principles, and Human-Computer Interaction (HCI). It emphasizes how important it is to have user-friendly, visually appealing, and accessible interfaces to help persons with LIS to communicate more effectively. Important technologies are evaluated for their potential to improve communication, including Augmented and Virtual Reality (AR & VR), Eye Tracking, and Brain-Computer Interfaces (BCI). To guarantee that the emerging technologies are both efficient and considerate of user demands, the review emphasizes the significance of ethical considerations and patient-centered design. This study intends to direct future design-based action research in constructing functional digital communication systems, using head-mounted Extended Reality (XR) technologies, by combining the various research findings from the review.

  • 17
  • 4399