
Publications by LIAAD

2023

Contrastive text summarization: a survey

Authors
Ströhle, T; Campos, R; Jatowt, A;

Publication
INTERNATIONAL JOURNAL OF DATA SCIENCE AND ANALYTICS

Abstract
In our data-flooded age, an enormous amount of redundant, but also disparate, textual data is collected daily on a wide variety of topics. Much of this information refers to documents related to the same theme, that is, different versions of the same document or different documents discussing the same topic. Being aware of such differences turns out to be an important aspect for those who want to perform a comparative task. However, as documents increase in size and volume, keeping up to date with, detecting, and summarizing relevant changes between different documents or versions of them becomes unfeasible. This motivates the rise of the contrastive (or comparative) summarization task, which attempts to summarize the text of different documents related to the same topic in a way that highlights the relevant differences between them. Our research aims to provide a systematic literature review of contrastive or comparative summarization, highlighting the different methods, data sets, metrics, and applications. Overall, we found that contrastive summarization is most commonly applied to controversial news articles, controversial opinions or sentiments on a topic, and product reviews. Despite the great interest in the topic, we note that standard data sets, as well as a competitive task dedicated to this topic, have yet to be proposed, which may impede the emergence of new methods. Moreover, the breakthrough of applying deep learning-based language models to abstractive summarization has yet to reach contrastive summarization.

2023

Contrastive Keyword Extraction from Versioned Documents

Authors
Eder, L; Campos, R; Jatowt, A;

Publication
PROCEEDINGS OF THE 32ND ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2023

Abstract
Versioned documents are common in many situations and play a vital part in numerous applications, enabling an overview of the revisions made to a document or document collection. However, as documents increase in size, it becomes difficult to summarize and comprehend all the changes made to versioned documents. In this paper, we propose a novel research problem of contrastive keyword extraction from versioned documents and introduce an unsupervised approach that extracts keywords to reflect the key changes made to an earlier document version. To provide an easy-to-use comparison and summarization tool, an open-source demonstration is made available at https://contrastive-keyword-extraction.streamlit.app/.

2023

Preface

Authors
Litvak, M; Rabaev, I; Campos, R; Jorge, M; Jatowt, A;

Publication
CEUR Workshop Proceedings

Abstract
[No abstract available]

2023

Report on the 1st Workshop on Implicit Author Characterization from Texts for Search and Retrieval (IACT 2023) at SIGIR 2023

Authors
Litvak, M; Rabaev, I; Campos, R; Jorge, AM; Jatowt, A;

Publication
SIGIR Forum

Abstract
[No abstract available]

2023

FALQU: Finding Answers to Legal Questions

Authors
Mansouri, B; Campos, R;

Publication
CoRR

Abstract
[No abstract available]

2023

AIIR and LIAAD Labs Systems for CLEF 2023 SimpleText

Authors
Mansouri, B; Durgin, S; Franklin, S; Fletcher, S; Campos, R;

Publication
Working Notes of the Conference and Labs of the Evaluation Forum (CLEF 2023), Thessaloniki, Greece, September 18th to 21st, 2023.

Abstract
This paper describes the participation of the Artificial Intelligence and Information Retrieval (AIIR) Lab from the University of Southern Maine and the Laboratory of Artificial Intelligence and Decision Support (LIAAD) from INESC TEC in the CLEF 2023 SimpleText lab. Three tasks are defined for SimpleText: (T1) What is in (or out)?, (T2) What is unclear?, and (T3) Rewrite this!. Five runs were submitted for Task 1 using traditional Information Retrieval and Sentence-BERT models. For Task 2, three runs were submitted using the YAKE! and KBIR keyword extraction models. Finally, for Task 3, two models were deployed, one using OpenAI Davinci embeddings and the other combining two unsupervised simplification models.
