Publications by Ricardo Campos

2023

Preface

Authors
Campos, R; Jorge, AM; Jatowt, A; Bhatia, S; Litvak, M;

Publication
CEUR Workshop Proceedings

Abstract
[No abstract available]

2024

The 7th International Workshop on Narrative Extraction from Texts: Text2Story 2024

Authors
Campos, R; Jorge, A; Jatowt, A; Bhatia, S; Litvak, M;

Publication
ADVANCES IN INFORMATION RETRIEVAL, ECIR 2024, PT V

Abstract
The Text2Story Workshop series, dedicated to Narrative Extraction from Texts, has been running successfully since 2018. Over the past six years, significant progress, largely propelled by Transformers and Large Language Models, has advanced our understanding of natural language text. Nevertheless, the representation, analysis, generation, and comprehensive identification of the different elements that compose a narrative structure remain challenging objectives. In its seventh edition, the workshop strives to consolidate a common platform and a multidisciplinary community for discussing and addressing various issues related to narrative extraction tasks. In particular, we aim to bring to the forefront the challenges involved in understanding narrative structures and integrating their representation into established frameworks, as well as into modern architectures (e.g., transformers) and AI-powered language models (e.g., ChatGPT), which are now common and form the backbone of almost every IR and NLP application. Text2Story encompasses sessions covering full research papers, work-in-progress, demos, resources, position and dissemination papers, along with keynote talks. Moreover, there is dedicated space for informal discussions on methods, challenges, and the future of research in this dynamic field.

2023

Towards Timeline Generation with Abstract Meaning Representation

Authors
Mansouri, B; Campos, R; Jatowt, A;

Publication
COMPANION OF THE WORLD WIDE WEB CONFERENCE, WWW 2023

Abstract
Timeline summarization (TLS) is a challenging research task that requires researchers to distill extensive and intricate temporal data into a concise and easily comprehensible representation. This paper proposes a novel approach to timeline summarization using Abstract Meaning Representations (AMRs), a graphical representation of the text where the nodes are semantic concepts and the edges denote relationships between concepts. With AMR, sentences with different wordings, but similar semantics, have similar representations. To make use of this feature for timeline summarization, a two-step sentence selection method that leverages features extracted from both AMRs and the text is proposed. First, AMRs are generated for each sentence. Sentences are then filtered by removing those with no named entities and keeping the ones with the highest number of named entities. In the next step, sentences to appear in the timeline are selected based on two scores: the Inverse Document Frequency (IDF) of AMR nodes, combined with the score obtained by applying a keyword extraction method to the text. Our experimental results on the TLS-Covid19 test collection demonstrate the potential of the proposed approach.
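
The two-step selection described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Sentence` container, the function names, the weighted sum with `alpha`, and the choice to compute IDF over the candidate sentences are all assumptions made for illustration. AMR nodes, named-entity counts, and keyword scores are assumed to be precomputed by external tools, with a higher keyword score meaning a more relevant sentence.

```python
import math
from dataclasses import dataclass


@dataclass
class Sentence:
    # Hypothetical container; all fields are assumed to be precomputed.
    text: str
    amr_nodes: frozenset   # concept labels produced by some AMR parser
    entity_count: int      # number of named entities in the sentence
    keyword_score: float   # score from some keyword extractor (higher = better)


def idf_table(sentences):
    """IDF of each AMR node, treating every sentence as one 'document'."""
    n = len(sentences)
    df = {}
    for s in sentences:
        for node in s.amr_nodes:
            df[node] = df.get(node, 0) + 1
    return {node: math.log(n / count) for node, count in df.items()}


def select_for_timeline(sentences, keep_top_entities=3, timeline_size=2, alpha=0.5):
    # Step 1: drop sentences without named entities, then keep the ones
    # with the highest entity counts.
    with_entities = [s for s in sentences if s.entity_count > 0]
    with_entities.sort(key=lambda s: s.entity_count, reverse=True)
    candidates = with_entities[:keep_top_entities]

    # Step 2: combine the mean IDF of a sentence's AMR nodes with its
    # keyword score. The weighted sum is an assumption of this sketch.
    idf = idf_table(candidates)

    def score(s):
        if not s.amr_nodes:
            mean_idf = 0.0
        else:
            mean_idf = sum(idf[n] for n in s.amr_nodes) / len(s.amr_nodes)
        return alpha * mean_idf + (1 - alpha) * s.keyword_score

    return sorted(candidates, key=score, reverse=True)[:timeline_size]
```

Sentences whose AMR nodes are rare among the candidates receive a higher IDF component, so a sentence that introduces new concepts can outrank one that repeats common ones, provided its keyword score is not much lower.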

2024

Pre-trained language models: What do they know?

Authors
Guimarães, N; Campos, R; Jorge, A;

Publication
WIREs Data Mining Knowl. Discov.

Abstract

2024

Keywords attention for fake news detection using few positive labels

Authors
de Souza, MC; Golo, MPS; Jorge, AMG; de Amorim, ECF; Campos, RNT; Marcacini, RM; Rezende, SO;

Publication
INFORMATION SCIENCES

Abstract
Fake news detection (FND) tools are essential to increase the reliability of information in social media. FND can be approached as a machine learning classification problem so that discriminative features can be automatically extracted. However, this requires a large news set, which in turn implies a considerable amount of effort from human experts for labeling. In this paper, we explore Positive and Unlabeled Learning (PUL) to reduce the labeling cost. In particular, we improve PUL with the network-based Label Propagation (PU-LP) algorithm. PU-LP achieved competitive results in FND by exploiting relations between news and terms and using few labeled fake news items. We propose integrating an attention mechanism in PU-LP that can define which terms in the network are more relevant for detecting fake news. We use GNEE, a state-of-the-art algorithm based on graph attention networks. Our proposal outperforms state-of-the-art methods, improving F1 by 2% to 10%, especially when only 10% of the fake news is labeled. It is competitive with the binary baseline, even when nearly half of the data is labeled. Discrimination ability is also visualized through t-SNE. We also present an analysis of the limitations of our approach according to the type of text found in each dataset.
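
To make the idea of propagating a few positive labels over a news-term network more concrete, here is a minimal Python sketch. It is a generic label-propagation toy on a bipartite graph, not the PU-LP algorithm and not GNEE: there is no reliable-negative extraction and no attention mechanism. The function name, the `damping` factor, and the averaging scheme are assumptions chosen for illustration.

```python
def propagate_from_positives(doc_terms, positive_docs, iterations=20, damping=0.85):
    """Spread 'fake' scores from a few labeled news items through shared terms.

    doc_terms: dict mapping a news id to the set of terms it contains.
    positive_docs: set of news ids labeled as fake (the only labels available).
    Returns a dict mapping each news id to a score in [0, 1].
    """
    # Invert the mapping to get the term side of the bipartite graph.
    term_docs = {}
    for doc, terms in doc_terms.items():
        for term in terms:
            term_docs.setdefault(term, set()).add(doc)

    scores = {doc: (1.0 if doc in positive_docs else 0.0) for doc in doc_terms}

    for _ in range(iterations):
        # News -> terms: a term takes the mean score of the news containing it.
        term_scores = {
            term: sum(scores[d] for d in docs) / len(docs)
            for term, docs in term_docs.items()
        }
        # Terms -> news: an unlabeled item takes the damped mean of its terms.
        new_scores = {}
        for doc, terms in doc_terms.items():
            if doc in positive_docs:
                new_scores[doc] = 1.0  # labeled positives stay clamped
            elif terms:
                mean = sum(term_scores[t] for t in terms) / len(terms)
                new_scores[doc] = damping * mean
            else:
                new_scores[doc] = 0.0
        scores = new_scores

    return scores
```

Unlabeled items that share vocabulary with labeled fake news end up with higher scores, while items in a disconnected part of the graph stay at zero. An attention mechanism, as proposed in the paper, would replace the uniform averaging over terms with learned weights.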

2024

Special issue on selected papers from ICADL 2022

Authors
Jatowt, A; Katsurai, M; Pozi, MSM; Campos, R;

Publication
INTERNATIONAL JOURNAL ON DIGITAL LIBRARIES

Abstract
[No abstract available]
