Cookies
O website necessita de alguns cookies e outros recursos semelhantes para funcionar. Caso o permita, o INESC TEC irá utilizar cookies para recolher dados sobre as suas visitas, contribuindo, assim, para estatísticas agregadas que permitem melhorar o nosso serviço. Ver mais
Aceitar Rejeitar
  • Menu
Publicações

Publicações por LIAAD

2023

Contrastive Keyword Extraction from Versioned Documents

Autores
Eder, L; Campos, R; Jatowt, A;

Publicação
PROCEEDINGS OF THE 32ND ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2023

Abstract
Versioned documents are common in many situations and play a vital part in numerous applications enabling an overview of the revisions made to a document or document collection. However, as documents increase in size, it gets difficult to summarize and comprehend all the changes made to versioned documents. In this paper, we propose a novel research problem of contrastive keyword extraction from versioned documents, and introduce an unsupervised approach that extracts keywords to reflect the key changes made to an earlier document version. In order to provide an easy-to-use comparison and summarization tool, an open-source demonstration is made available which can be found at https://contrastive-keyword-extraction.streamlit.app/.

2023

Preface

Autores
Litvak, M; Rabaev, I; Campos, R; Jorge, M; Jatowt, A;

Publicação
CEUR Workshop Proceedings

Abstract
[No abstract available]

2023

Report on the 1st Workshop on Implicit Author Characterization from Texts for Search and Retrieval (IACT 2023) at SIGIR 2023

Autores
Litvak, M; Rabaev, I; Campos, R; Jorge, AM; Jatowt, A;

Publicação
SIGIR Forum

Abstract

2023

FALQU: Finding Answers to Legal Questions

Autores
Mansouri, B; Campos, R;

Publicação
CoRR

Abstract

2023

AIIR and LIAAD Labs Systems for CLEF 2023 SimpleText

Autores
Mansouri, B; Durgin, S; Franklin, S; Fletcher, S; Campos, R;

Publicação
Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2023), Thessaloniki, Greece, September 18th to 21st, 2023.

Abstract
This paper describes the participation of the Artificial Intelligence and Information Retrieval (AIIR) Lab from the University of Southern Maine and the Laboratory of Artificial Intelligence and Decision Support (LIAAD) lab from INESC TEC in the CLEF 2023 SimpleText lab. There are three tasks defined for SimpleText: (T1) What is in (or out)?, (T2) What is unclear?, and (T3) Rewrite this!. Five runs were submitted for Task 1 using traditional Information Retrieval, and Sentence-BERT models. For Task 2, three runs were submitted, using YAKE! and KBIR keyword extraction models. Finally, for Task 3, two models were deployed, one using OpenAI Davinci embeddings and the other combining two unsupervised simplification models.

2023

Is this news article still relevant? Ranking by contemporary relevance in archival search

Autores
Jatowt, A; Sato, M; Draxl, S; Duan, YJ; Campos, R; Yoshikawa, M;

Publicação
INTERNATIONAL JOURNAL ON DIGITAL LIBRARIES

Abstract
Our civilization creates enormous volumes of digital data, a substantial fraction of which is preserved and made publicly available for present and future usage. Additionally, historical born-analog records are progressively being digitized and incorporated into digital document repositories. While professionals often have a clear idea of what they are looking for in document archives, average users are likely to have no precise search needs when accessing available archives (e.g., through their online interfaces). Thus, if the results are to be relevant and appealing to average people, they should include engaging and recognizable material. However, state-of-the-art document archival retrieval systems essentially use the same approaches as search engines for synchronic document collections. In this article, we develop unique ranking criteria for assessing the usefulness of archived contents based on their estimated relationship with current times, which we call contemporary relevance. Contemporary relevance may be utilized to enhance access to archival document collections, increasing the likelihood that users will discover interesting or valuable material. We next present an effective strategy for estimating contemporary relevance degrees of news articles by utilizing learning to rank approach based on a variety of diverse features, and we then successfully test it on the New York Times news collection. The incorporation of the contemporary relevance computation into archival retrieval systems should enable a new search style in which search results are meant to relate to the context of searchers' times, and by this have the potential to engage the archive users. As a proof of concept, we develop and demonstrate a working prototype of a simplified ranking model that operates on the top of the Portuguese Web Archive portal (arquivo.pt).

  • 32
  • 428