2022
Authors
Campos R.; Jorge A.M.; Jatowt A.; Bhatia S.; Litvak M.; Rocha C.; Cordeiro J.P.;
Publication
CEUR Workshop Proceedings
Abstract
2022
Authors
Luria, S; Campos, R;
Publication
Unlocking Environmental Narratives: Towards Understanding Human Environment Interactions through Computational Text Analysis
Abstract
2023
Authors
Campos, R; Correia, D; Jatowt, A;
Publication
ADVANCES IN INFORMATION RETRIEVAL, ECIR 2023, PT III
Abstract
Over the past few decades, the amount of information generated has turned the Web into the largest knowledge infrastructure existing to date. Web archives have been at the forefront of data preservation, preventing the loss of data that is significant to humankind. Different snapshots of the web are saved every day, enabling users to surf the past web and to travel through it over time. Despite these efforts, many people are not aware that the web is being preserved, and often find these infrastructures unattractive or difficult to use when compared to common search engines. In this paper, we take a step towards making use of this preserved information by developing Public Archive, an intuitive interface that enables end-users to search and analyze a large-scale collection of 67,242 preserved news articles belonging to a Portuguese reference newspaper (Jornal Publico). The collection was obtained by scraping 10,976 versions of the homepage of Jornal Publico preserved by the Portuguese web archive infrastructure (Arquivo.pt) during the period from 2010 to 2021. By doing this, we aim not only to take a stand on making use of this preserved information, but also to offer an easy-to-follow solution, the Public Archive Python package, which lays the groundwork to be used (with minor adaptations) by other news source providers interested in offering their readers access to past news articles.
2023
Authors
Goncalves, F; Campos, R; Jorge, A;
Publication
ADVANCES IN INFORMATION RETRIEVAL, ECIR 2023, PT III
Abstract
In recent years, the amount of information generated, consumed and stored has grown at an astonishing rate, making it difficult for those seeking information to extract knowledge in good time. This has become even more important, as the average reader is less willing than in the past to spare time from an already busy schedule, and thus prioritizes news in a summarized format, which is faster to digest. On top of that, people increasingly rely on strong visual components to help them understand the focal point of news articles in a less tiresome manner. This growing demand for exploring information through visual aspects calls for alternative approaches to text understanding and narrative exploration. This motivated us to propose Text2Storyline, a platform for generating and exploring enriched storylines from an input text, a URL or a user query. The latter is issued on the Portuguese Web Archive (Arquivo.pt), giving users the chance to expand their knowledge and build upon information collected from web sources of the past. To fulfill this objective, we propose a system that makes use of the TimeMatters algorithm to filter out non-relevant dates and organize relevant content by means of different displays: 'Annotated Text', 'Entities', 'Storyline', 'Temporal Clustering' and 'Word Cloud'. To extend the users' knowledge, we rely on entity linking to connect persons, events, locations and concepts found in the text to Wikipedia pages, a process also known as Wikification. Each of the entities is then illustrated by means of an image collected from Arquivo.pt.
2023
Authors
Silvano, P; Amorim, E; Leal, A; Cantante, I; Silva, F; Jorge, A; Campos, R; Nunes, S;
Publication
Proceedings of Text2Story - Sixth Workshop on Narrative Extraction From Texts held in conjunction with the 45th European Conference on Information Retrieval (ECIR 2023), Dublin, Ireland, April 2, 2023.
Abstract
News articles typically include reporting events to inform on what happened. These reporting events are not part of the story being told but are nonetheless a relevant part of the news and can pose a challenge to the computational processing of news narratives. They compose a reporting narrative, which is the present study's focus. This paper aims to demonstrate through selected use cases how a comprehensive annotation scheme with suitable tags and links can properly represent the reporting events and the way they relate to the events that make the story. In addition, we put forward a proposal for their visual representation that enables a systematic and detailed analysis of the importance of reporting events in the news structure. Finally, we describe some lexico-grammatical features of reporting events, which can contribute to their automatic detection. © 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
2023
Authors
Campos, R; Jorge, AM; Jatowt, A; Bhatia, S; Litvak, M;
Publication
Text2Story@ECIR
Abstract