About

I am an associate professor at the Department of Computer Science of the Faculty of Science of the University of Porto and the coordinator of LIAAD, the Artificial Intelligence and Decision Support Lab of UP. LIAAD has been a unit of INESC TEC (Laboratório Associado) since 2007. I hold a PhD in Computer Science from U. Porto, an MSc in Foundations of Advanced Information Technology from Imperial College, and a BSc in Applied Maths and Computer Science, now called Computer Science (U. Porto). My research interests are Data Mining and Machine Learning, in particular association rules, web and text intelligence, and data mining for decision support. My past research also includes Inductive Logic Programming and Collaborative Data Mining. I lecture courses related to programming, information processing, data mining, and other areas of computing. While at the Faculty of Economics, where I stayed from 1996 to 2009, I launched, with other colleagues, the MSc in Data Analysis and Decision Support Systems, which I coordinated from 2000 to April 2008. I lead research projects on data mining and web intelligence. I was the director of the Masters in Computer Science at DCC-FCUP from June 2010 to August 2013. I co-chaired international conferences (ECML/PKDD 2015, Discovery Science 2009, ECML/PKDD 05 and EPIA 01), workshops and seminars in data mining and artificial intelligence. I was Vice-President of APPIA, the Portuguese Association for Artificial Intelligence.

Details

  • Name

    Alípio Jorge
  • Role

    Centre Coordinator
  • Since

    1st January 2008
Publications

2024

Pre-trained language models: What do they know?

Authors
Guimaraes, N; Campos, R; Jorge, A;

Publication
WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY

Abstract
Large language models (LLMs) have substantially pushed artificial intelligence (AI) research and applications in the last few years. They are currently able to achieve high effectiveness in different natural language processing (NLP) tasks, such as machine translation, named entity recognition, text classification, question answering, or text summarization. Recently, significant attention has been drawn to OpenAI's GPT models' capabilities and extremely accessible interface. LLMs are nowadays routinely used and studied for downstream tasks and specific applications with great success, pushing forward the state of the art in almost all of them. However, they also exhibit impressive inference capabilities when used off the shelf without further training. In this paper, we aim to study the behavior of pre-trained language models (PLMs) in some inference tasks they were not initially trained for. Therefore, we focus our attention on very recent research works related to the inference capabilities of PLMs in some selected tasks such as factual probing and common-sense reasoning. We highlight relevant achievements made by these models, as well as some of their current limitations that open opportunities for further research. This article is categorized under: Fundamental Concepts of Data and Knowledge > Key Design Issues in Data Mining; Technologies > Artificial Intelligence

2024

Indexing Portuguese NLP Resources with PT-Pump-Up

Authors
Almeida, R; Campos, R; Jorge, A; Nunes, S;

Publication
CoRR

Abstract

2024

Physio: An LLM-Based Physiotherapy Advisor

Authors
Almeida, R; Sousa, H; Cunha, LF; Guimaraes, N; Campos, R; Jorge, A;

Publication
ADVANCES IN INFORMATION RETRIEVAL, ECIR 2024, PT V

Abstract
The capabilities of the most recent language models have increased the interest in integrating them into real-world applications. However, the fact that these models generate plausible, yet incorrect text poses a constraint when considering their use in several domains. Healthcare is a prime example of a domain where text-generative trustworthiness is a hard requirement to safeguard patient well-being. In this paper, we present Physio, a chat-based application for physical rehabilitation. Physio is capable of making an initial diagnosis while citing reliable health sources to support the information provided. Furthermore, drawing upon external knowledge databases, Physio can recommend rehabilitation exercises and over-the-counter medication for symptom relief. By combining these features, Physio can leverage the power of generative models for language processing while also conditioning its response on dependable and verifiable sources. A live demo of Physio is available at https://physio.inesctec.pt.

2024

Heterogeneity in families with ATTRV30M amyloidosis: a historical and longitudinal Portuguese case study impact for genetic counselling

Authors
Pedroto, M; Coelho, T; Fernandes, J; Oliveira, A; Jorge, A; Mendes Moreira, J;

Publication
AMYLOID-JOURNAL OF PROTEIN FOLDING DISORDERS

Abstract
Background: Hereditary transthyretin amyloidosis (ATTRv amyloidosis) is an inherited disease, where the study of family history holds importance. This study evaluates the changes of age-of-onset (AOO) and other age-related clinical factors within and among families affected by ATTRv amyloidosis. Methods: We analysed information from 934 trees, focusing on family, parents, probands and siblings relationships. We focused on 1494 female and 1712 male symptomatic ATTRV30M patients. Results are presented alongside a comparison of current with historical records. Clinical and genealogical indicators identify major changes. Results: Overall, analysis of familial data shows the existence of families with both early and late patients (1/6). It identifies long familial follow-up times since patient families tend to be diagnosed over several years. Finally, results show a large difference between parent-child and proband-patient relationships (20-30 years). Conclusions: This study reveals that there has been a shift in patient profile, with a recent increase in male elderly cases, especially regarding probands. It shows that symptomatic patients exhibit less variability towards siblings, when compared to other family members, namely the transmitting ancestors' age of onset. This can influence genetic counselling guidelines.

2024

Text2Story Lusa: A Dataset for Narrative Analysis in European Portuguese News Articles

Authors
Nunes, S; Jorge, AM; Amorim, E; Sousa, HO; Leal, A; Silvano, PM; Cantante, I; Campos, R;

Publication
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC/COLING 2024, 20-25 May, 2024, Torino, Italy.

Abstract
Narratives have been the subject of extensive research across various scientific fields such as linguistics and computer science. However, the scarcity of freely available datasets, essential for studying this genre, remains a significant obstacle. Furthermore, datasets annotated with narrative components and their morphosyntactic and semantic information are even scarcer. To address this gap, we developed the Text2Story Lusa datasets, which consist of a collection of news articles in European Portuguese. The first dataset consists of 357 news articles and the second dataset comprises a subset of 117 manually densely annotated articles, totaling over 50 thousand individual annotations. By focusing on texts with substantial narrative elements, we aim to provide a valuable resource for studying narrative structures in European Portuguese news articles. On the one hand, the first dataset provides researchers with data to study narratives from various perspectives. On the other hand, the annotated dataset facilitates research in information extraction and related tasks, particularly in the context of narrative extraction pipelines. Both datasets are made available adhering to FAIR principles, thereby enhancing their utility within the research community.

Supervised theses

2023

Domain-specific and Context-aware Approaches to Sentiment Analysis

Author
Shamsuddeen Hassan Muhammad

Institution
UP-FCUP

2023

Digital technology and the social monitoring of climate change

Author
Ana Sofia Cabral Cardoso

Institution
UP-FCUP

2023

Building Portuguese Language Resources for Natural Language Processing Tasks

Author
Rúben Filipe Seabra de Almeida

Institution
UP-FCUP

2023

Heart Sound Analysis for Cardiovascular Diseases Identification

Author
Diogo Marcelo Esterlita Nogueira

Institution
UP-FCUP

2023

Product Complaint Understanding using NLP Techniques

Author
Beatriz Marques Arcipreste

Institution
UP-FCUP