About

I am an associate professor in the Department of Computer Science of the Faculty of Sciences of the University of Porto and coordinator of LIAAD, the Laboratory of Artificial Intelligence and Decision Support of the University of Porto. LIAAD has been a centre of INESC TEC since 2007. I hold a PhD in Computer Science from the University of Porto, an MSc in Foundations of Advanced Information Technology from Imperial College, and a Licenciatura in Applied Mathematics, Computer Science branch (University of Porto). My research interests are Data Mining and Machine Learning, in particular association rules, text mining and recommender systems. My earlier research includes inductive logic programming and collaborative data mining. I teach courses related to programming, information processing, data mining and other areas of computing. While at the Faculty of Economics, where I stayed from 1996 to 2009, I launched, with other colleagues, the master's programme in Data Analysis and Decision Support Systems (MADSAD), which I coordinated from 2000 to April 2008. I lead projects in data mining and web intelligence. I was director of the Master's in Computer Science at DCC-FCUP from June 2010 to August 2013. I co-organised international conferences (ECML/PKDD 2015, Discovery Science 2009, ECML/PKDD 05 and EPIA 01), workshops and seminars in data mining and artificial intelligence. I was Vice-President of APPIA, the Portuguese Association for Artificial Intelligence.

Topics of interest
Details

  • Name

    Alípio Jorge
  • Position

    Centre Coordinator
  • Since

    1 January 2008
Publications

2024

Pre-trained language models: What do they know?

Authors
Guimaraes, N; Campos, R; Jorge, A;

Publication
WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY

Abstract
Large language models (LLMs) have substantially pushed artificial intelligence (AI) research and applications in the last few years. They are currently able to achieve high effectiveness in different natural language processing (NLP) tasks, such as machine translation, named entity recognition, text classification, question answering, or text summarization. Recently, significant attention has been drawn to OpenAI's GPT models' capabilities and extremely accessible interface. LLMs are nowadays routinely used and studied for downstream tasks and specific applications with great success, pushing forward the state of the art in almost all of them. However, they also exhibit impressive inference capabilities when used off the shelf without further training. In this paper, we aim to study the behavior of pre-trained language models (PLMs) in some inference tasks they were not initially trained for. Therefore, we focus our attention on very recent research works related to the inference capabilities of PLMs in some selected tasks such as factual probing and common-sense reasoning. We highlight relevant achievements made by these models, as well as some of their current limitations that open opportunities for further research. This article is categorized under: Fundamental Concepts of Data and Knowledge > Key Design Issues in Data Mining; Technologies > Artificial Intelligence

2024

Indexing Portuguese NLP Resources with PT-Pump-Up

Authors
Almeida, R; Campos, R; Jorge, A; Nunes, S;

Publication
CoRR

Abstract

2024

Physio: An LLM-Based Physiotherapy Advisor

Authors
Almeida, R; Sousa, H; Cunha, LF; Guimaraes, N; Campos, R; Jorge, A;

Publication
ADVANCES IN INFORMATION RETRIEVAL, ECIR 2024, PT V

Abstract
The capabilities of the most recent language models have increased the interest in integrating them into real-world applications. However, the fact that these models generate plausible, yet incorrect text poses a constraint when considering their use in several domains. Healthcare is a prime example of a domain where text-generative trustworthiness is a hard requirement to safeguard patient well-being. In this paper, we present Physio, a chat-based application for physical rehabilitation. Physio is capable of making an initial diagnosis while citing reliable health sources to support the information provided. Furthermore, drawing upon external knowledge databases, Physio can recommend rehabilitation exercises and over-the-counter medication for symptom relief. By combining these features, Physio can leverage the power of generative models for language processing while also conditioning its response on dependable and verifiable sources. A live demo of Physio is available at https://physio.inesctec.pt.

2024

Heterogeneity in families with ATTRV30M amyloidosis: a historical and longitudinal Portuguese case study impact for genetic counselling

Authors
Pedroto, M; Coelho, T; Fernandes, J; Oliveira, A; Jorge, A; Mendes Moreira, J;

Publication
AMYLOID-JOURNAL OF PROTEIN FOLDING DISORDERS

Abstract
Background: Hereditary transthyretin amyloidosis (ATTRv amyloidosis) is an inherited disease, where the study of family history holds importance. This study evaluates the changes of age-of-onset (AOO) and other age-related clinical factors within and among families affected by ATTRv amyloidosis. Methods: We analysed information from 934 trees, focusing on family, parents, probands and siblings relationships. We focused on 1494 female and 1712 male symptomatic ATTRV30M patients. Results are presented alongside a comparison of current with historical records. Clinical and genealogical indicators identify major changes. Results: Overall, analysis of familial data shows the existence of families with both early and late patients (1/6). It identifies long familial follow-up times since patient families tend to be diagnosed over several years. Finally, results show a large difference between parent-child and proband-patient relationships (20-30 years). Conclusions: This study reveals that there has been a shift in patient profile, with a recent increase in male elderly cases, especially regarding probands. It shows that symptomatic patients exhibit less variability towards siblings, when compared to other family members, namely the transmitting ancestors' age of onset. This can influence genetic counselling guidelines.

2024

Text2Story Lusa: A Dataset for Narrative Analysis in European Portuguese News Articles

Authors
Nunes, S; Jorge, AM; Amorim, E; Sousa, HO; Leal, A; Silvano, PM; Cantante, I; Campos, R;

Publication
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC/COLING 2024, 20-25 May, 2024, Torino, Italy.

Abstract
Narratives have been the subject of extensive research across various scientific fields such as linguistics and computer science. However, the scarcity of freely available datasets, essential for studying this genre, remains a significant obstacle. Furthermore, datasets annotated with narrative components and their morphosyntactic and semantic information are even scarcer. To address this gap, we developed the Text2Story Lusa datasets, which consist of a collection of news articles in European Portuguese. The first dataset consists of 357 news articles and the second dataset comprises a subset of 117 manually densely annotated articles, totaling over 50 thousand individual annotations. By focusing on texts with substantial narrative elements, we aim to provide a valuable resource for studying narrative structures in European Portuguese news articles. On the one hand, the first dataset provides researchers with data to study narratives from various perspectives. On the other hand, the annotated dataset facilitates research in information extraction and related tasks, particularly in the context of narrative extraction pipelines. Both datasets are made available adhering to FAIR principles, thereby enhancing their utility within the research community.

Supervised theses

2023

Predicting user personality from digital media

Author
Ricardo da Cunha Magalhães Lopes

Institution
UP-FCUP

2023

Learning Word Sense Representations from Neural Language Models

Author
Daniel Alexandre Bouçanova Loureiro

Institution
UP-FCUP

2023

Domain-specific and Context-aware Approaches to Sentiment Analysis

Author
Shamsuddeen Hassan Muhammad

Institution
UP-FCUP

2023

Digital technology and the social monitoring of climate change

Author
Ana Sofia Cabral Cardoso

Institution
UP-FCUP

2023

Building Portuguese Language Resources for Natural Language Processing Tasks

Author
Rúben Filipe Seabra de Almeida

Institution
UP-FCUP