Publications by Ricardo Campos


FALQU: Finding Answers to Legal Questions

Mansouri, B; Campos, R;




AIIR and LIAAD Labs Systems for CLEF 2023 SimpleText

Mansouri, B; Durgin, S; Franklin, S; Fletcher, S; Campos, R;

Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2023), Thessaloniki, Greece, September 18th to 21st, 2023.

This paper describes the participation of the Artificial Intelligence and Information Retrieval (AIIR) Lab from the University of Southern Maine and the Laboratory of Artificial Intelligence and Decision Support (LIAAD) from INESC TEC in the CLEF 2023 SimpleText lab. There are three tasks defined for SimpleText: (T1) What is in (or out)?, (T2) What is unclear?, and (T3) Rewrite this!. Five runs were submitted for Task 1 using traditional Information Retrieval and Sentence-BERT models. For Task 2, three runs were submitted, using YAKE! and KBIR keyword extraction models. Finally, for Task 3, two models were deployed, one using OpenAI Davinci embeddings and the other combining two unsupervised simplification models.


Is this news article still relevant? Ranking by contemporary relevance in archival search

Jatowt, A; Sato, M; Draxl, S; Duan, YJ; Campos, R; Yoshikawa, M;


Our civilization creates enormous volumes of digital data, a substantial fraction of which is preserved and made publicly available for present and future usage. Additionally, historical born-analog records are progressively being digitized and incorporated into digital document repositories. While professionals often have a clear idea of what they are looking for in document archives, average users are likely to have no precise search needs when accessing available archives (e.g., through their online interfaces). Thus, if the results are to be relevant and appealing to average people, they should include engaging and recognizable material. However, state-of-the-art document archival retrieval systems essentially use the same approaches as search engines for synchronic document collections. In this article, we develop unique ranking criteria for assessing the usefulness of archived contents based on their estimated relationship with current times, which we call contemporary relevance. Contemporary relevance may be utilized to enhance access to archival document collections, increasing the likelihood that users will discover interesting or valuable material. We next present an effective strategy for estimating contemporary relevance degrees of news articles by utilizing a learning-to-rank approach based on a variety of diverse features, and we then successfully test it on the New York Times news collection. The incorporation of the contemporary relevance computation into archival retrieval systems should enable a new search style in which search results are meant to relate to the context of searchers' times, and thereby have the potential to engage the archive users. As a proof of concept, we develop and demonstrate a working prototype of a simplified ranking model that operates on top of the Portuguese Web Archive portal (


Report on the Second International Workshop on Narrative Extraction from Texts (Text2Story 2019)

Jorge, AM; Campos, R; Jatowt, A; Bhatia, S; Pasquali, A; Cordeiro, JP; Rocha, C; Mangaravite, V;


The Second International Workshop on Narrative Extraction from Texts (Text2Story'19) was held on the 14th of April 2019, in conjunction with the 41st European Conference on Information Retrieval (ECIR 2019) in Cologne, Germany. The workshop provided a platform for researchers in IR, NLP, and design and visualization to come together and share the recent advances in extraction and formal representation of narratives. The workshop consisted of two invited talks, ten research paper presentations, and a poster and demo session. The proceedings of the workshop are available online at


Text2Story Lusa: A Dataset for Narrative Analysis in European Portuguese News Articles

Nunes, S; Jorge, AM; Amorim, E; Sousa, HO; Leal, A; Silvano, PM; Cantante, I; Campos, R;

Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC/COLING 2024, 20-25 May, 2024, Torino, Italy.

Narratives have been the subject of extensive research across various scientific fields such as linguistics and computer science. However, the scarcity of freely available datasets, essential for studying this genre, remains a significant obstacle. Furthermore, datasets annotated with narrative components and their morphosyntactic and semantic information are even scarcer. To address this gap, we developed the Text2Story Lusa datasets, which consist of a collection of news articles in European Portuguese. The first dataset consists of 357 news articles and the second dataset comprises a subset of 117 manually densely annotated articles, totaling over 50 thousand individual annotations. By focusing on texts with substantial narrative elements, we aim to provide a valuable resource for studying narrative structures in European Portuguese news articles. On the one hand, the first dataset provides researchers with data to study narratives from various perspectives. On the other hand, the annotated dataset facilitates research in information extraction and related tasks, particularly in the context of narrative extraction pipelines. Both datasets are made available adhering to FAIR principles, thereby enhancing their utility within the research community.



Campos, R; Jorge, AM; Jatowt, A; Bhatia, S; Rocha, C; Cordeiro, JP;

CEUR Workshop Proceedings

