2021
Autores
Amorim, E; Ribeiro, A; Santana, BS; Cantante, I; Jorge, A; Nunes, S; Silvano, P; Leal, A; Campos, R;
Publicação
Proceedings of Text2Story - Fourth Workshop on Narrative Extraction From Texts held in conjunction with the 43rd European Conference on Information Retrieval (ECIR 2021), Lucca, Italy, April 1, 2021 (online event due to Covid-19 outbreak).
Abstract
Narrative Extraction from text is a complex task that starts by identifying a set of narrative elements (actors, events, times), and the semantic links between them (temporal, referential, semantic roles). The outcome is a structure or set of structures which can then be represented graphically, thus opening room for further and alternative exploration of the plot. Such visualization can also be useful during the on-going annotation process. Manual annotation of narratives can be a complex effort and the possibility offered by the Brat annotation tool of annotating directly on the text does not seem sufficiently helpful. In this paper, we propose Brat2Viz, a tool and a pipeline that displays visualization of narrative information annotated in Brat. Brat2Viz reads the annotation file of Brat, produces an intermediate representation in the declarative language DRS (Discourse Representation Structure), and from this obtains the visualization. Currently, we make available two visualization schemes: MSC (Message Sequence Chart) and Knowledge Graphs. The modularity of the pipeline enables the future extension to new annotation sources, different annotation schemes, and alternative visualizations or representations. We illustrate the pipeline using examples from an European Portuguese news corpus. Copyright © by the paper's authors.
2023
Autores
Silvano, P; Amorim, E; Leal, A; Cantante, I; Silva, F; Jorge, A; Campos, R; Nunes, S;
Publicação
Proceedings of Text2Story - Sixth Workshop on Narrative Extraction From Texts held in conjunction with the 45th European Conference on Information Retrieval (ECIR 2023), Dublin, Ireland, April 2, 2023.
Abstract
News articles typically include reporting events to inform on what happened. These reporting events are not part of the story being told but are nonetheless a relevant part of the news and can pose a challenge to the computational processing of news narratives. They compose a reporting narrative, which is the present study's focus. This paper aims to demonstrate through selected use cases how a comprehensive annotation scheme with suitable tags and links can properly represent the reporting events and the way they relate to the events that make the story. In addition, we put forward a proposal for their visual representation that enables a systematic and detailed analysis of the importance of reporting events in the news structure. Finally, we describe some lexico-grammatical features of reporting events, which can contribute to their automatic detection. © 2023 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
2023
Autores
Santana, B; Campos, R; Amorim, E; Jorge, A; Silvano, P; Nunes, S;
Publicação
ARTIFICIAL INTELLIGENCE REVIEW
Abstract
Narratives are present in many forms of human expression and can be understood as a fundamental way of communication between people. Computational understanding of the underlying story of a narrative, however, may be a rather complex task for both linguists and computational linguistics. Such task can be approached using natural language processing techniques to automatically extract narratives from texts. In this paper, we present an in depth survey of narrative extraction from text, providing a establishing a basis/framework for the study roadmap to the study of this area as a whole as a means to consolidate a view on this line of research. We aim to fulfill the current gap by identifying important research efforts at the crossroad between linguists and computer scientists. In particular, we highlight the importance and complexity of the annotation process, as a crucial step for the training stage. Next, we detail methods and approaches regarding the identification and extraction of narrative components, their linkage and understanding of likely inherent relationships, before detailing formal narrative representation structures as an intermediate step for visualization and data exploration purposes. We then move into the narrative evaluation task aspects, and conclude this survey by highlighting important open issues under the domain of narratives extraction from texts that are yet to be explored.
2023
Autores
Caseli, H; Amorim, E; Schneider, ETR; Freitas, LIA; Rodrigues, J; Nunes, MdGV;
Publicação
Anais do XVII Women in Information Technology (WIT 2023)
Abstract
2024
Autores
Nunes, S; Jorge, AM; Amorim, E; Sousa, HO; Leal, A; Silvano, PM; Cantante, I; Campos, R;
Publicação
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, LREC/COLING 2024, 20-25 May, 2024, Torino, Italy.
Abstract
Narratives have been the subject of extensive research across various scientific fields such as linguistics and computer science. However, the scarcity of freely available datasets, essential for studying this genre, remains a significant obstacle. Furthermore, datasets annotated with narratives components and their morphosyntactic and semantic information are even scarcer. To address this gap, we developed the Text2Story Lusa datasets, which consist of a collection of news articles in European Portuguese. The first datasets consists of 357 news articles and the second dataset comprises a subset of 117 manually densely annotated articles, totaling over 50 thousand individual annotations. By focusing on texts with substantial narrative elements, we aim to provide a valuable resource for studying narrative structures in European Portuguese news articles. On the one hand, the first dataset provides researchers with data to study narratives from various perspectives. On the other hand, the annotated dataset facilitates research in information extraction and related tasks, particularly in the context of narrative extraction pipelines. Both datasets are made available adhering to FAIR principles, thereby enhancing their utility within the research community.
2024
Autores
de Souza, MC; Golo, MPS; Jorge, AMG; de Amorim, ECF; Campos, RNT; Marcacini, RM; Rezende, SO;
Publicação
INFORMATION SCIENCES
Abstract
Fake news detection (FND) tools are essential to increase the reliability of information in social media. FND can be approached as a machine learning classification problem so that discriminative features can be automatically extracted. However, this requires a large news set, which in turn implies a considerable amount of human experts' effort for labeling. In this paper, we explore Positive and Unlabeled Learning (PUL) to reduce the labeling cost. In particular, we improve PUL with the network-based Label Propagation (PU-LP) algorithm. PU-LP achieved competitive results in FND exploiting relations between news and terms and using few labeled fake news. We propose integrating an attention mechanism in PU-LP that can define which terms in the network are more relevant for detecting fake news. We use GNEE, a state-of-the-art algorithm based on graph attention networks. Our proposal outperforms state-of-the-art methods, improving F-1 in 2% to 10%, especially when only 10% labeled fake news are available. It is competitive with the binary baseline, even when nearly half of the data is labeled. Discrimination ability is also visualized through t-SNE. We also present an analysis of the limitations of our approach according to the type of text found in each dataset.
The access to the final selection minute is only available to applicants.
Please check the confirmation e-mail of your application to obtain the access code.