About

Ricardo Campos is an assistant professor in the Department of Informatics at the University of Beira Interior (UBI) and an invited professor at Porto Business School. He is a senior researcher at LIAAD-INESC TEC, the Laboratory of Artificial Intelligence and Decision Support of the University of Porto, and a collaborator of Ci2.ipt, the Smart Cities Research Center of the Polytechnic Institute of Tomar. He holds a PhD in Computer Science from the University of Porto (U. Porto) and a master's and a bachelor's degree from the University of Beira Interior (UBI). He has more than 10 years of research experience in information retrieval and natural language processing, during which his work has received several scientific merit awards at international conferences and scientific competitions. He is the author of the keyword extraction software YAKE! and of the Conta-me Histórias and Arquivo Público projects, among others. He has participated in several research projects funded by FCT. His research focuses on developing methods for extracting narratives from text, in particular on identifying entities, events, and their temporal aspects, and the relations among them. He has co-organized international conferences and workshops in information retrieval and regularly serves on the scientific committees of several international conferences. He is also a member of the editorial boards of the International Journal of Data Science and Analytics (Springer) and the Information Processing and Management Journal (Elsevier). He is a member of the scientific advisory forum of Portulan Clarin, the Research Infrastructure for the Science and Technology of Language, which is part of the National Roadmap of Research Infrastructures of Strategic Relevance.

Details

  • Name

    Ricardo Campos
  • Role

    Senior Researcher
  • Since

    01 July 2012
Publications

2024

Pre-trained language models: What do they know?

Authors
Guimaraes, N; Campos, R; Jorge, A;

Publication
WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY

Abstract
Large language models (LLMs) have substantially pushed artificial intelligence (AI) research and applications in the last few years. They are currently able to achieve high effectiveness in different natural language processing (NLP) tasks, such as machine translation, named entity recognition, text classification, question answering, or text summarization. Recently, significant attention has been drawn to OpenAI's GPT models' capabilities and extremely accessible interface. LLMs are nowadays routinely used and studied for downstream tasks and specific applications with great success, pushing forward the state of the art in almost all of them. However, they also exhibit impressive inference capabilities when used off the shelf without further training. In this paper, we aim to study the behavior of pre-trained language models (PLMs) in some inference tasks they were not initially trained for. Therefore, we focus our attention on very recent research works related to the inference capabilities of PLMs in some selected tasks such as factual probing and common-sense reasoning. We highlight relevant achievements made by these models, as well as some of their current limitations that open opportunities for further research. This article is categorized under: Fundamental Concepts of Data and Knowledge > Key Design Issues in Data Mining; Technologies > Artificial Intelligence

2024

Indexing Portuguese NLP Resources with PT-Pump-Up

Authors
Almeida, R; Campos, R; Jorge, A; Nunes, S;

Publication
CoRR

Abstract

2024

Physio: An LLM-Based Physiotherapy Advisor

Authors
Almeida, R; Sousa, H; Cunha, LF; Guimaraes, N; Campos, R; Jorge, A;

Publication
ADVANCES IN INFORMATION RETRIEVAL, ECIR 2024, PT V

Abstract
The capabilities of the most recent language models have increased the interest in integrating them into real-world applications. However, the fact that these models generate plausible, yet incorrect text poses a constraint when considering their use in several domains. Healthcare is a prime example of a domain where text-generative trustworthiness is a hard requirement to safeguard patient well-being. In this paper, we present Physio, a chat-based application for physical rehabilitation. Physio is capable of making an initial diagnosis while citing reliable health sources to support the information provided. Furthermore, drawing upon external knowledge databases, Physio can recommend rehabilitation exercises and over-the-counter medication for symptom relief. By combining these features, Physio can leverage the power of generative models for language processing while also conditioning its response on dependable and verifiable sources. A live demo of Physio is available at https://physio.inesctec.pt.

2024

Text2Story Lusa: A Dataset for Narrative Analysis in European Portuguese News Articles

Authors
Nunes, S; Jorge, AM; Amorim, E; Sousa, HO; Leal, A; Silvano, PM; Cantante, I; Campos, R;

Publication
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC/COLING 2024, 20-25 May, 2024, Torino, Italy.

Abstract
Narratives have been the subject of extensive research across various scientific fields such as linguistics and computer science. However, the scarcity of freely available datasets, essential for studying this genre, remains a significant obstacle. Furthermore, datasets annotated with narrative components and their morphosyntactic and semantic information are even scarcer. To address this gap, we developed the Text2Story Lusa datasets, which consist of a collection of news articles in European Portuguese. The first dataset consists of 357 news articles and the second dataset comprises a subset of 117 manually densely annotated articles, totaling over 50 thousand individual annotations. By focusing on texts with substantial narrative elements, we aim to provide a valuable resource for studying narrative structures in European Portuguese news articles. On the one hand, the first dataset provides researchers with data to study narratives from various perspectives. On the other hand, the annotated dataset facilitates research in information extraction and related tasks, particularly in the context of narrative extraction pipelines. Both datasets are made available adhering to FAIR principles, thereby enhancing their utility within the research community.

2024

The 7th International Workshop on Narrative Extraction from Texts: Text2Story 2024

Authors
Campos, R; Jorge, A; Jatowt, A; Bhatia, S; Litvak, M;

Publication
ADVANCES IN INFORMATION RETRIEVAL, ECIR 2024, PT V

Abstract
The Text2Story Workshop series, dedicated to Narrative Extraction from Texts, has been running successfully since 2018. Over the past six years, significant progress, largely propelled by Transformers and Large Language Models, has advanced our understanding of natural language text. Nevertheless, the representation, analysis, generation, and comprehensive identification of the different elements that compose a narrative structure remain a challenging objective. In its seventh edition, the workshop strives to consolidate a common platform and a multidisciplinary community for discussing and addressing various issues related to narrative extraction tasks. In particular, we aim to bring to the forefront the challenges involved in understanding narrative structures and integrating their representation into established frameworks, as well as in modern architectures (e.g., transformers) and AI-powered language models (e.g., ChatGPT) which are now common and form the backbone of almost every IR and NLP application. Text2Story encompasses sessions covering full research papers, work-in-progress, demos, resources, position and dissemination papers, along with keynote talks. Moreover, there is dedicated space for informal discussions on methods, challenges, and the future of research in this dynamic field.