Publications - INESC TEC


Publications by Alípio Jorge

2023

ORSUM 2023 - 6th Workshop on Online Recommender Systems and User Modeling

Authors
Vinagre, J; Ghossein, MA; Peska, L; Jorge, AM; Bifet, A;

Publication
Proceedings of the 17th ACM Conference on Recommender Systems, RecSys 2023, Singapore, Singapore, September 18-22, 2023

Abstract
Modern online platforms for user modeling and recommendation require complex data infrastructures to collect and process data. Some of this data has to be kept to later be used in batches to train personalization models. However, since user activity data can be generated at very fast rates, it is also useful to have algorithms able to process data streams online, in real time. Given the continuous and potentially fast change of content, context and user preferences or intents, stream-based models and their synchronization with batch models can be extremely challenging. Therefore, it is important to investigate methods able to transparently and continuously adapt to the inherent dynamics of user interactions, preferably over long periods of time. Models able to continuously learn from such flows of data are gaining attention in the recommender systems community, and are being increasingly deployed in online platforms. However, many challenges associated with learning from streams need further investigation. The objective of this workshop is to foster contributions and bring together a growing community of researchers and practitioners interested in online, adaptive approaches to user modeling, recommendation and personalization, and their implications regarding multiple dimensions, such as reproducibility, privacy, fairness, diversity, transparency, auditability, and compliance with recently adopted or upcoming legal frameworks worldwide. © 2023 Owner/Author.

2023

Tweet2Story: Extracting Narratives from Twitter

Authors
Campos, V; Campos, R; Jorge, A;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2023, PT I

Abstract
Topics discussed on social media platforms contain a disparate amount of information written in colloquial language, making it difficult to understand the narrative of the topic. In this paper, we take a step towards resolving this problem by proposing a framework that automatically extracts narratives from documents such as tweets. To this end, we propose a methodology that extracts information from the texts through a pipeline of tasks, such as co-reference resolution and the extraction of entity relations. The result of this process is embedded into an annotation file to be used by subsequent operations, such as visualization schemas. We named this framework Tweet2Story and measured its effectiveness under an evaluation schema that involved three different aspects: (i) as an Open Information Extraction (OpenIE) task, (ii) by comparing the narratives of manually annotated news articles linked to tweets about the same topic, and (iii) by qualitatively comparing the knowledge graphs produced from the narratives. The results show high precision and moderate recall, on par with other state-of-the-art OpenIE frameworks, and confirm that narratives can be extracted from small texts. Furthermore, we show that the narrative can be visualized in an easily understandable way.

2023

Event Extraction for Portuguese: A QA-Driven Approach Using ACE-2005

Authors
Cunha, LF; Campos, R; Jorge, A;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2023, PT I

Abstract
Event extraction is an Information Retrieval task that commonly consists of identifying the central word for the event (trigger) and the event's arguments. This task has been extensively studied for English but lags behind for Portuguese, partly due to the lack of task-specific annotated corpora. This paper proposes a framework in which two separate BERT-based models are fine-tuned to identify and classify events in Portuguese documents. We decompose this task into two sub-tasks. First, we use a token classification model to detect event triggers. To extract event arguments, we train a Question Answering model that queries the triggers about their corresponding event argument roles. Given the lack of event-annotated corpora in Portuguese, we translated the original version of the ACE-2005 dataset (a reference in the field) into Portuguese, producing a new corpus for Portuguese event extraction. To accomplish this, we developed an automatic translation pipeline. Our framework obtains F1 scores of 64.4 for trigger classification and 46.7 for argument classification, thus setting a new state-of-the-art reference for these tasks in Portuguese.

2023

Symbolic Versus Deep Learning Techniques for Explainable Sentiment Analysis

Authors
Muhammad, SH; Brazdil, P; Jorge, A;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2023, PT I

Abstract
Deep learning approaches have become popular in many different areas, including sentiment analysis (SA), because of their competitive performance. However, the downside of these approaches is that they do not provide understandable explanations of how the sentiment values are calculated. In contrast, previous approaches that used sentiment lexicons can do that, but their performance is normally not high. To leverage the strengths of both approaches, we present a neuro-symbolic approach that combines deep learning (DL) and symbolic methods for SA tasks. The DL approach uses a pre-trained language model (PLM) to construct a sentiment lexicon. The symbolic approach exploits the constructed sentiment lexicon and manually constructed shifter patterns to determine the sentiment of a sentence. Our experimental results show that the proposed approach leads to promising results, with the additional advantage that sentiment predictions can be accompanied by understandable explanations.
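
As a rough illustration of the symbolic half of such an approach, the sketch below scores a sentence with a small lexicon and a single negation shifter rule, and returns the contributing terms as the explanation. The lexicon entries, the shifter list, and the function name `explain_sentiment` are hypothetical and chosen only for illustration; they are not taken from the paper, whose lexicon is built with a pre-trained language model.

```python
# Hypothetical toy lexicon (assumption); the paper derives its lexicon from a PLM.
LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}
# Hypothetical shifter words (assumption) that invert the next sentiment word.
NEGATORS = {"not", "never", "no"}


def explain_sentiment(sentence):
    """Return (score, explanation) for a whitespace-tokenized sentence."""
    score = 0.0
    explanation = []
    negate = False
    for token in sentence.lower().split():
        word = token.strip(".,!?;:")
        if word in NEGATORS:
            # Remember the shifter until the next sentiment-bearing word.
            negate = True
            continue
        if word in LEXICON:
            value = -LEXICON[word] if negate else LEXICON[word]
            label = ("not " if negate else "") + word
            explanation.append((label, value))
            score += value
            negate = False
    return score, explanation
```

Calling `explain_sentiment("The plot was not good, but the acting was great!")` yields a score together with the list of lexicon hits that produced it, which is the kind of traceable output a purely neural classifier does not give directly.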

2023

Combining Neighbor Models to Improve Predictions of Age of Onset of ATTRv Carriers

Authors
Pedroto, M; Jorge, A; Mendes-Moreira, J; Coelho, T;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2023, PT II

Abstract
Transthyretin (TTR)-related familial amyloid polyneuropathy (ATTRv) is a life-threatening autosomal dominant disease, and the age of onset represents the moment when the first symptoms are felt. Accurately predicting the age of onset for a given patient is relevant for risk assessment and treatment management. In this work, we evaluate the impact of combining prediction models obtained from neighboring time windows on prediction error. We propose Symmetric (Sym) and Asymmetric (Asym) models, which represent two different averaging approaches. These are combined with a weighting mechanism to create four ensemble models: Symmetric (Sym), Symmetric-weighted (Sym-w), Asymmetric (Asym), and Asymmetric-weighted (Asym-w). These four ensemble models are then compared to the original approach, which is focused on individual regression base learners, namely Baseline (BL), Decision Tree (DT), Elastic Net (EN), Lasso (LA), Linear Regression (LR), Random Forest (RF), Ridge (RI), Support Vector Regressor (SV) and XGBoost (XG). Our results show that aggregating predictions from neighbor models decreases the average mean absolute error obtained by each base learner. Overall, the best results are achieved by regression-based ensemble tree models as base learners.
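
One way to picture the neighbor-combination idea is sketched below: predictions from models trained on adjacent time windows are averaged, optionally with weights that decay with distance from the target window. The window indexing, the reading of "symmetric" as windows on both sides and "asymmetric" as earlier windows only, the inverse-distance weighting, and the function name `combine_neighbors` are all assumptions made for illustration; the abstract does not define them.

```python
def combine_neighbors(preds, target, radius=1, symmetric=True, weighted=False):
    """Average predictions from models of neighboring time windows.

    preds[i] is the prediction made by the model of window i (assumed layout).
    """
    lo = max(0, target - radius)
    # Assumption: the asymmetric variant only looks at earlier windows.
    hi = min(len(preds) - 1, target + radius) if symmetric else target
    total = 0.0
    weight_sum = 0.0
    for i in range(lo, hi + 1):
        # Assumption: weights shrink with distance from the target window.
        w = 1.0 / (1 + abs(i - target)) if weighted else 1.0
        total += w * preds[i]
        weight_sum += w
    return total / weight_sum
```

With `preds = [40.0, 44.0, 48.0]` and `target=1`, the symmetric variant averages all three windows, while the asymmetric variant uses only windows 0 and 1.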

2023

Report on the 6th International Workshop on Narrative Extraction from Texts (Text2Story 2023) at ECIR 2023

Authors
Campos, R; Jorge, AM; Jatowt, A; Bhatia, S; Litvak, M; Cordeiro, JP; Rocha, C; Sousa, H; Mansouri, B;

Publication
SIGIR Forum

Abstract