Publications

Publications by Pavel Brazdil

2018

Incremental TextRank - Automatic Keyword Extraction for Text Streams

Authors
Sarmento, RP; Cordeiro, M; Brazdil, P; Gama, J;

Publication
Proceedings of the 20th International Conference on Enterprise Information Systems, ICEIS 2018, Funchal, Madeira, Portugal, March 21-24, 2018, Volume 1.

Abstract
Text Mining and NLP techniques are a hot topic nowadays. Researchers strive to develop new and faster algorithms to cope with larger amounts of data. In particular, interest in text data analysis has been increasing due to the growth of social network media. Given this, the development of new algorithms and/or the upgrade of existing ones is now a crucial task to deal with text mining problems under this new scenario. In this paper, we present an update to TextRank, a well-known implementation used to do automatic keyword extraction from text, adapted to deal with streams of text. In addition, we present results for this implementation and compare them with the batch version. The major improvements are lower computation times for the processing of the same text data, in a streaming environment, both in sliding window and incremental setups. The speedups obtained in the experimental results are significant. Therefore the approach was considered valid and useful to the research community.

2018

Evolving Networks and Social Network Analysis Methods and Techniques

Authors
Cordeiro, M; Sarmento, RP; Brazdil, P; Gama, J;

Publication
Social Media and Journalism - Trends, Connections, Implications

2019

Identifying, Ranking and Tracking Community Leaders in Evolving Social Networks

Authors
Cordeiro, M; Sarmento, RP; Brazdil, P; Kimura, M; Gama, J;

Publication
Complex Networks and Their Applications VIII - Volume 1: Proceedings of the Eighth International Conference on Complex Networks and Their Applications, COMPLEX NETWORKS 2019, Lisbon, Portugal, December 10-12, 2019.

Abstract
Discovering communities in a network is a fundamental and important problem in complex networks. Finding the most influential actors among their peers is a major task. On the one hand, studies on community detection ignore the influence of actors and communities; on the other hand, ignoring the hierarchy and community structure of the network neglects the actor or community influence. We bridge this gap by combining a dynamic community detection method with a dynamic centrality measure. The proposed enhanced dynamic hierarchical community detection method computes centrality for nodes and aggregated communities and selects each community's representative leader using the ranked centrality of every node belonging to the community. This method is then able to unveil, track, and measure the importance of main actors and of the network's intra- and inter-community structural hierarchies based on a centrality measure. The empirical analysis performed, using two temporal networks, shows that the method is able to find and track community leaders in evolving networks. © 2020, Springer Nature Switzerland AG.

2019

Association and Temporality between News and Tweets

Authors
Moutinho, V; Brazdil, P; Cordeiro, J;

Publication
Proceedings of the 11th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, IC3K 2019, Volume 1: KDIR, Vienna, Austria, September 17-19, 2019.

Abstract
With the advent of social media, the boundaries of mainstream journalism and social networks are becoming blurred. User-generated content is increasing, and hence, journalists dedicate considerable time to searching platforms such as Facebook and Twitter to announce, spread, and monitor news and crowd-check information. Many studies have looked at social networks as news sources, but the relationship and interconnections between this type of platform and news media have not been thoroughly investigated. In this work, we have studied a series of news articles and examined a set of related comments on a social network during a period of six months. Specifically, a sample of articles from generalist Portuguese news sources published in the first half of 2016 was clustered, and the resulting clusters were then associated with tweets of Portuguese users by means of a similarity measure. Focusing on a subset of clusters, we have performed a temporal analysis by examining the evolution of the two types of documents (articles and tweets) and the timing of when they appeared. It appears that for some stories, namely Brexit and the European Football Cup, the publishing of news articles intensifies on key dates (event-oriented), while the discussion on social media is more balanced throughout the months leading up to those events.

2020

Incremental Approach for Automatic Generation of Domain-Specific Sentiment Lexicon

Authors
Muhammad, SH; Brazdil, P; Jorge, A;

Publication
Advances in Information Retrieval - 42nd European Conference on IR Research, ECIR 2020, Lisbon, Portugal, April 14-17, 2020, Proceedings, Part II

Abstract
A sentiment lexicon plays a vital role in lexicon-based sentiment analysis. The lexicon-based method is often preferred because it leads to more explainable answers in comparison with many machine learning-based methods. But the semantic orientation of a word depends on its domain. Hence, a general-purpose sentiment lexicon may give sub-optimal performance compared with a domain-specific lexicon. However, it is challenging to manually generate a domain-specific sentiment lexicon for each domain. Moreover, it is impractical to generate a complete sentiment lexicon for a domain from a single corpus. To this end, we propose an approach to automatically generate a domain-specific sentiment lexicon using a vector model enriched by weights. Importantly, we propose an incremental approach for updating an existing lexicon to either the same domain or a different domain (domain adaptation). Finally, we discuss how to incorporate sentiment lexicon information in neural models (word embedding) for better performance. © Springer Nature Switzerland AG 2020.

2020

Sentence Compression for Portuguese

Authors
Nobrega, FAA; Jorge, AM; Brazdil, P; Pardo, TAS;

Publication
Computational Processing of the Portuguese Language, PROPOR 2020

Abstract
The task of Sentence Compression aims at producing a shorter version of a given sentence. This task may assist many other applications, such as Automatic Summarization and Text Simplification. In this paper, we investigate methods for Sentence Compression for Portuguese. We focus on machine learning-based algorithms and propose new strategies. We also create reference corpora/datasets for the area, making it possible to train and test the methods of interest. Our results show that some of our methods outperform previous initiatives for Portuguese and produce results competitive with a state-of-the-art method in the area.
