2021
Authors
Pasquali, A; Campos, R; Ribeiro, A; Santana, BS; Jorge, A; Jatowt, A;
Publication
Advances in Information Retrieval - 43rd European Conference on IR Research, ECIR 2021, Virtual Event, March 28 - April 1, 2021, Proceedings, Part I
Abstract
The rise of social media and the explosion of digital news in the web sphere have created new challenges to extract knowledge and make sense of published information. Automated timeline generation appears in this context as a promising answer to help users deal with this information overload problem. Formally, Timeline Summarization (TLS) can be defined as a subtask of Multi-Document Summarization (MDS) conceived to highlight the most important information during the development of a story over time by summarizing long-lasting events in chronological order. As opposed to traditional MDS, TLS has a limited number of publicly available datasets. In this paper, we propose the TLS-Covid19 dataset, a novel corpus for the Portuguese and English languages. Our aim is to provide a new, larger and multi-lingual TLS annotated dataset that could foster timeline summarization evaluation research and, at the same time, enable the study of news coverage about the COVID-19 pandemic. TLS-Covid19 consists of 178 curated topics related to the COVID-19 outbreak, with associated news articles covering almost the entire year of 2020 and their respective reference timelines as gold-standard. As a final outcome, we conduct an experimental study on the proposed dataset over two extreme baseline methods. All the resources are publicly available at https://github.com/LIAAD/tls-covid19. © 2021, Springer Nature Switzerland AG.
2021
Authors
Campos, R; Jorge, A; Jatowt, A; Bhatia, S; Finlayson, MA;
Publication
Advances in Information Retrieval - 43rd European Conference on IR Research, ECIR 2021, Virtual Event, March 28 - April 1, 2021, Proceedings, Part II
Abstract
Narrative extraction, understanding and visualization is currently a popular topic and an important tool for humans interested in achieving a deeper understanding of text. Information Retrieval (IR), Natural Language Processing (NLP) and Machine Learning (ML) already offer many instruments that aid the exploration of narrative elements in text and within unstructured data. Despite evident advances in the last couple of years, the problem of automatically representing narratives in a structured form, beyond the conventional identification of common events, entities and their relationships, is yet to be solved. This workshop, held virtually on April 1st, 2021 and co-located with the 43rd European Conference on Information Retrieval (ECIR’21), aims at presenting and discussing current and future directions for IR, NLP, ML and other computational fields capable of improving the automatic understanding of narratives. It includes a session devoted to regular, short and demo papers, keynote talks and space for an informal discussion of the methods, of the challenges and of the future of the area. © 2021, Springer Nature Switzerland AG.
2021
Authors
Trindade, J; Vinagre, J; Fernandes, K; Paiva, N; Jorge, A;
Publication
ADVANCES IN INTELLIGENT DATA ANALYSIS XIX, IDA 2021
Abstract
In the past decade, we have witnessed the widespread adoption of Deep Neural Networks (DNNs) in several Machine Learning tasks. However, in many critical domains, such as healthcare, finance, or law enforcement, transparency is crucial. In particular, the lack of ability to conform with prior knowledge greatly affects the trustworthiness of predictive models. This paper contributes to the trustworthiness of DNNs by promoting monotonicity. We develop a multi-layer learning architecture that handles a subset of features in a dataset that, according to prior knowledge, have a monotonic relation with the response variable. We use two alternative approaches: (i) imposing constraints on the model's parameters, and (ii) applying an additional component to the loss function that penalises non-monotonic gradients. Our method is evaluated on classification and regression tasks using two datasets. Our model is able to conform to known monotonic relations, improving trustworthiness in decision making, while simultaneously maintaining small and controllable degradation in predictive ability.
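The two approaches named in the abstract above can be illustrated with a minimal sketch. This is not the authors' architecture: the toy one-layer model, the use of `abs()` to force non-negative weights, the finite-difference estimate of the gradient, and every name below are assumptions chosen for illustration only.

```python
import math

def monotonicity_penalty(model, samples, feature, eps=1e-3):
    """Approach (ii), sketched: average of max(0, -df/dx_feature).

    The partial derivative is estimated with a forward difference, so the
    penalty is zero whenever the model is non-decreasing in `feature`
    at every sample.
    """
    total = 0.0
    for x in samples:
        shifted = list(x)
        shifted[feature] += eps
        slope = (model(shifted) - model(x)) / eps
        total += max(0.0, -slope)
    return total / len(samples)

def constrained_model(weights, bias):
    """Approach (i), sketched: weights are made non-negative with abs(),
    and tanh is increasing, so the output is non-decreasing in every input."""
    def model(x):
        z = sum(abs(w) * xi for w, xi in zip(weights, x)) + bias
        return math.tanh(z)
    return model

# Hypothetical toy data: three samples with two features each.
samples = [[0.0, 1.0], [0.5, -0.2], [-1.0, 0.3]]

increasing = constrained_model([-0.5, 2.0], 0.1)  # abs() flips the -0.5
decreasing = lambda x: -x[0] + x[1]               # violates monotonicity in feature 0

pen_inc = monotonicity_penalty(increasing, samples, feature=0)
pen_dec = monotonicity_penalty(decreasing, samples, feature=0)
```

In a training loop, a term like `monotonicity_penalty` would typically be scaled by a coefficient and added to the task loss, which is one way to trade predictive ability against conformance to the known monotonic relation.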
2021
Authors
Silva, C; da Silva, MF; Rodrigues, A; Silva, J; Costa, VS; Jorge, A; Dutra, I;
Publication
Recent Challenges in Intelligent Information and Database Systems - 13th Asian Conference, ACIIDS 2021, Phuket, Thailand, April 7-10, 2021, Proceedings
Abstract
This paper presents an effort to handle 400+ GBytes of sensor data in a timely manner in order to produce Predictive Maintenance (PdM) models. We follow a data-driven methodology, using state-of-the-art Python libraries, such as Dask and Modin, which can handle big data. We use Dynamic Time Warping to describe sensor behavior, an anomaly detection method (Matrix Profile) and forecasting methods (AutoRegressive Integrated Moving Average - ARIMA, Holt-Winters and Long Short-Term Memory - LSTM). The data was collected by various sensors in an industrial context and is composed of attributes that define their activity, characterizing the environment where they are inserted, e.g. optical, temperature, pollution and working hours. We successfully managed to highlight aspects of all sensors' behaviors, and produce forecast models for distinct series of sensors, despite the data dimension. © 2021, Springer Nature Singapore Pte Ltd.
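As a rough illustration of Dynamic Time Warping, the technique the abstract above uses to compare sensor series, the following is a textbook dynamic-programming sketch in plain Python. It is not the paper's pipeline (which relies on Dask and Modin at a much larger scale); the function name, the absolute-difference local cost, and the toy series are assumptions.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) Dynamic Time Warping distance.

    cost[i][j] holds the cheapest cumulative cost of aligning the first i
    points of `a` with the first j points of `b`.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the best of: insertion, deletion, or a diagonal match.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]

# Hypothetical readings: the second series repeats one value, so warping
# aligns it with the first at zero cost.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
offset = dtw_distance([0, 0], [1, 1])
```

Because the alignment may stretch or compress the time axis, two sensors that follow the same pattern at different speeds score as close, which a point-by-point distance would miss.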
2021
Authors
Campos, R; Jorge, AM; Jatowt, A; Bhatia, S; Finlayson, MA;
Publication
Text2Story@ECIR
Abstract
2020
Authors
Campos, R; Jorge, AM; Jatowt, A; Bhatia, S; Pasquali, A; Cordeiro, JP; Rocha, C; Mansouri, B; Santana, BS;
Publication
SIGIR Forum
Abstract