Cookies Policy
The website need some cookies and similar means to function. If you permit us, we will use those means to collect data on your visits for aggregated statistics to improve our service. Find out More
Accept Reject
  • Menu
Publications

Publications by LIAAD

2021

Brat2Viz: a Tool and Pipeline for Visualizing Narratives from Annotated Texts

Authors
Amorim, E; Ribeiro, A; Santana, BS; Cantante, I; Jorge, A; Nunes, S; Silvano, P; Leal, A; Campos, R;

Publication
Proceedings of Text2Story - Fourth Workshop on Narrative Extraction From Texts held in conjunction with the 43rd European Conference on Information Retrieval (ECIR 2021), Lucca, Italy, April 1, 2021 (online event due to Covid-19 outbreak).

Abstract
Narrative Extraction from text is a complex task that starts by identifying a set of narrative elements (actors, events, times), and the semantic links between them (temporal, referential, semantic roles). The outcome is a structure or set of structures which can then be represented graphically, thus opening room for further and alternative exploration of the plot. Such visualization can also be useful during the on-going annotation process. Manual annotation of narratives can be a complex effort and the possibility offered by the Brat annotation tool of annotating directly on the text does not seem sufficiently helpful. In this paper, we propose Brat2Viz, a tool and a pipeline that displays visualization of narrative information annotated in Brat. Brat2Viz reads the annotation file of Brat, produces an intermediate representation in the declarative language DRS (Discourse Representation Structure), and from this obtains the visualization. Currently, we make available two visualization schemes: MSC (Message Sequence Chart) and Knowledge Graphs. The modularity of the pipeline enables the future extension to new annotation sources, different annotation schemes, and alternative visualizations or representations. We illustrate the pipeline using examples from an European Portuguese news corpus. Copyright © by the paper's authors.

2021

ORSUM 2021-4th Workshop on Online Recommender Systems and User Modeling

Authors
Vinagre, J; Jorge, AM; Al Ghossein, M; Bifet, A;

Publication
15TH ACM CONFERENCE ON RECOMMENDER SYSTEMS (RECSYS 2021)

Abstract
Modern online services continuously generate data at very fast rates. This continuous flow of data encompasses content - e.g. posts, news, products, comments -, but also user feedback - e.g. ratings, views, reads, clicks -, together with context data - user device, spacial or temporal data, user task or activity, weather. This can be overwhelming for systems and algorithms designed to train in batches, given the continuous and potentially fast change of content, context and user preferences or intents. Therefore, it is important to investigate online methods able to transparently adapt to the inherent dynamics of online services. Incremental models that learn from data streams are gaining attention in the recommender systems community, given their natural ability to deal with the continuous flows of data generated in dynamic, complex environments. User modeling and personalization can particularly benefit from algorithms capable of maintaining models incrementally and online. The objective of this workshop is to foster contributions and bring together a growing community of researchers and practitioners interested in online, adaptive approaches to user modeling, recommendation and personalization, and their implications regarding multiple dimensions, such as evaluation, reproducibility, privacy and explainability.

2021

Automatic generation of timelines for past-web events

Authors
Campos, R; Pasquali, A; Jatowt, A; Mangaravite, V; Jorge, AM;

Publication
The Past Web: Exploring Web Archives

Abstract
Despite significant advances in web archive infrastructures, the problem of exploring the historical heritage preserved by web archives is yet to be solved. Timeline generation emerges in this context as one possible solution for automatically producing summaries of news over time. Thanks to this, users can gain a better sense of reported news events, entities, stories or topics over time, such as getting a summary of the most important news about a politician, an organisation or a locality. Web archives play an important role here by providing access to a historical set of preserved information. This particular characteristic of web archives makes them an irreplaceable infrastructure and a valuable source of knowledge that contributes to the process of timeline generation. Accordingly, the authors of this chapter developed "Tell me Stories" (), a news summarisation system, built on top of the infrastructure of Arquivo.pt-the Portuguese web-archive-to automatically generate a timeline summary of a given topic. In this chapter, we begin by providing a brief overview of the most relevant research conducted on the automatic generation of timelines for past-web events. Next, we describe the architecture and some use cases for "Tell me Stories". Our system demonstrates how web archives can be used as infrastructures to develop innovative services. We conclude this chapter by enumerating open challenges in this field and possible future directions in the general area of temporal summarisation in web archives. © Springer Nature Switzerland AG 2021. All rights reserved.

2021

Do we really need a segmentation step in heart sound classification algorithms?

Authors
Oliveira, J; Nogueira, D; Renna, F; Ferreira, C; Jorge, AM; Coimbra, M;

Publication
2021 43RD ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE & BIOLOGY SOCIETY (EMBC)

Abstract
Cardiac auscultation is the key screening procedure to detect and identify cardiovascular diseases (CVDs). One of many steps to automatically detect CVDs using auscultation, concerns the detection and delimitation of the heart sound boundaries, a process known as segmentation. Whether to include or not a segmentation step in the signal classification pipeline is nowadays a topic of discussion. Up to our knowledge, the outcome of a segmentation algorithm has been used almost exclusively to align the different signal segments according to the heartbeat. In this paper, the need for a heartbeat alignment step is tested and evaluated over different machine learning algorithms, including deep learning solutions. From the different classifiers tested, Gate Recurrent Unit (GRU) Network and Convolutional Neural Network (CNN) algorithms are shown to be the most robust. Namely, these algorithms can detect the presence of heart murmurs even without a heartbeat alignment step. Furthermore, Support Vector Machine (SVM) and Random Forest (RF) algorithms require an explicit segmentation step to effectively detect heart sounds and murmurs, the overall performance is expected drop approximately 5% on both cases.

2021

Report on the 4th international workshop on narrative extraction from texts (Text2Story 2021) at ECIR 2021

Authors
Campos, R; Jorge, AM; Jatowt, A; Bhatia, S; Finlayson, MA; Cordeiro, JP; Rocha, C; Ribeiro, A; Mansouri, B; Ansah, J; Pasquali, A;

Publication
SIGIR Forum

Abstract

2021

Improving Portuguese Semantic Role Labeling with Transformers and Transfer Learning

Authors
Oliveira, S; Loureiro, D; Jorge, A;

Publication
2021 IEEE 8TH INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA)

Abstract
The Natural Language Processing task of determining Who did what to whom is called Semantic Role Labeling. For English, recent methods based on Transformer models have allowed for major improvements in this task over the previous state of the art. However, for low resource languages, like Portuguese, currently available semantic role labeling models are hindered by scarce training data. In this paper, we explore a model architecture with only a pre-trained Transformer-based model, a linear layer, softmax and Viterbi decoding. We substantially improve the state-of-the-art performance in Portuguese by over 15 F1. Additionally, we improve semantic role labeling results in Portuguese corpora by exploiting cross-lingual transfer learning using multilingual pre-trained models, and transfer learning from dependency parsing in Portuguese, evaluating the various proposed approaches empirically.

  • 71
  • 429