
Publications by LIAAD

2020

BRIGHT-Drift-Aware Demand Predictions for Taxi Networks

Authors
Saadallah, A; Moreira Matias, L; Sousa, R; Khiari, J; Jenelius, E; Gama, J;

Publication
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING

Abstract
Massive data broadcast by GPS-equipped vehicles provide unprecedented opportunities. One of the main tasks in optimizing our transportation networks is to build data-driven real-time decision support systems. However, the dynamic environments in which these networks operate invalidate the traditional assumptions, such as finite training sets or stationary distributions, required to put many off-the-shelf supervised learning algorithms into practice. In this paper, we propose BRIGHT: a drift-aware supervised learning framework to predict demand quantities. BRIGHT aims to provide accurate predictions for short-term horizons through a creative ensemble of time series analysis methods that handles distinct types of concept drift. By selecting neighborhoods dynamically, BRIGHT reduces the likelihood of overfitting. By ensuring diversity among the base learners, BRIGHT achieves a high reduction of variance while keeping bias stable. Experiments were conducted using three large-scale heterogeneous real-world transportation networks in Porto (Portugal), Shanghai (China), and Stockholm (Sweden), complemented by controlled experiments on synthetic data in which multiple distinct drifts were artificially induced. The obtained results illustrate the advantages of BRIGHT in relation to state-of-the-art methods for this task.

2020

Transfer Learning in urban object classification: Online images to recognize point clouds

Authors
Balado, J; Sousa, R; Diaz Vilarino, L; Arias, P;

Publication
AUTOMATION IN CONSTRUCTION

Abstract
The application of Deep Learning techniques to point clouds for urban object classification is limited by the large number of samples needed. Acquiring and tagging point clouds is more expensive and tedious than the equivalent process for images. Online point cloud datasets contain too few samples for Deep Learning or do not always include the desired classes. This work focuses on minimizing the use of point cloud samples for neural network training in urban object classification. The proposed method is based on the conversion of point clouds to images (pc-images) because it enables: the use of Convolutional Neural Networks, the generation of several samples (images) per object (point cloud) by means of multi-view, and the combination of pc-images with images from online datasets (ImageNet and Google Images). The study is conducted with ten classes of objects extracted from two street point clouds from two different cities. The network selected is InceptionV3. The training set consists of 5000 online images with a variable percentage (0% to 10%) of pc-images. The validation and testing sets are composed exclusively of pc-images. Although the network trained only with online images reached 47% accuracy, the inclusion of a small percentage of pc-images in the training set improves the classification to 99.5% accuracy with 6% pc-images. The network is also applied to the IQmulus & TerraMobilita Contest dataset, where it allows the correct classification of elements with few samples.
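
The training-set composition described in the abstract above (a pool of online images plus a small percentage of pc-images) can be illustrated with a short sketch. This is not the authors' code: the function name `build_training_set`, its parameters, and the reading of "percentage" as a share of the final mixed set are all assumptions made for illustration.

```python
import random


def build_training_set(online_images, pc_images, pc_fraction, seed=0):
    """Mix online images with pc-images (hypothetical helper).

    Assumption: `pc_fraction` is the share of pc-images in the FINAL
    training set, so n_pc / (n_online + n_pc) == pc_fraction.
    """
    if not 0.0 <= pc_fraction < 1.0:
        raise ValueError("pc_fraction must be in [0, 1)")

    # Solve n_pc / (n_online + n_pc) = f for n_pc.
    n_online = len(online_images)
    n_pc = round(n_online * pc_fraction / (1.0 - pc_fraction))
    if n_pc > len(pc_images):
        raise ValueError("not enough pc-images for the requested fraction")

    rng = random.Random(seed)
    chosen_pc = rng.sample(list(pc_images), n_pc)

    # Shuffle so that pc-images are spread across training batches.
    mixed = list(online_images) + chosen_pc
    rng.shuffle(mixed)
    return mixed


if __name__ == "__main__":
    online = [f"online_{i}.jpg" for i in range(5000)]
    pc = [f"pc_{i}.png" for i in range(1000)]
    train = build_training_set(online, pc, pc_fraction=0.06)
    share = sum(name.startswith("pc_") for name in train) / len(train)
    print(len(train), f"{share:.3f}")
```

Under a different reading, in which the percentage is taken relative to the 5000 online images alone, the count would simply be `round(n_online * pc_fraction)`.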

2020

YAKE! Keyword extraction from single documents using multiple local features

Authors
Campos, R; Mangaravite, V; Pasquali, A; Jorge, A; Nunes, C; Jatowt, A;

Publication
INFORMATION SCIENCES

Abstract
As the amount of generated information grows, reading and summarizing texts of large collections turns into a challenging task. Many documents do not come with descriptive terms, thus requiring humans to generate keywords on-the-fly. The need to automate this kind of task demands the development of keyword extraction systems with the ability to automatically identify keywords within the text. One approach is to resort to machine-learning algorithms. These, however, depend on large annotated text corpora, which are not always available. An alternative solution is to consider an unsupervised approach. In this article, we describe YAKE!, a light-weight unsupervised automatic keyword extraction method which rests on statistical text features extracted from single documents to select the most relevant keywords of a text. Our system does not need to be trained on a particular set of documents, nor does it depend on dictionaries, external corpora, text size, language, or domain. To demonstrate the merits and significance of YAKE!, we compare it against ten state-of-the-art unsupervised approaches and one supervised method. Experiments carried out on twenty datasets show that YAKE! significantly outperforms other unsupervised methods on texts of different sizes, languages, and domains.

2020

The 3rd International Workshop on Narrative Extraction from Texts: Text2Story 2020

Authors
Campos, R; Jorge, A; Jatowt, A; Bhatia, S;

Publication
Advances in Information Retrieval - 42nd European Conference on IR Research, ECIR 2020, Lisbon, Portugal, April 14-17, 2020, Proceedings, Part II

Abstract
The Third International Workshop on Narrative Extraction from Texts (Text2Story’20) [text2story20.inesctec.pt], held in conjunction with the 42nd European Conference on Information Retrieval (ECIR 2020), gives researchers in IR, NLP and other fields the opportunity to share their recent advances in the extraction and formal representation of narratives. This workshop also presents a forum to consolidate the multi-disciplinary efforts and foster discussions around the narrative extraction task, a hot topic in recent years. © Springer Nature Switzerland AG 2020.

2020

Proceedings of Text2Story - Third Workshop on Narrative Extraction From Texts co-located with 42nd European Conference on Information Retrieval, Text2Story@ECIR 2020, Lisbon, Portugal, April 14th, 2020 [online only]

Authors
Campos, R; Jorge, AM; Jatowt, A; Bhatia, S;

Publication
Text2Story@ECIR

Abstract

2020

Incremental Approach for Automatic Generation of Domain-Specific Sentiment Lexicon

Authors
Muhammad, SH; Brazdil, P; Jorge, A;

Publication
Advances in Information Retrieval - 42nd European Conference on IR Research, ECIR 2020, Lisbon, Portugal, April 14-17, 2020, Proceedings, Part II

Abstract
Sentiment lexicons play a vital role in lexicon-based sentiment analysis. The lexicon-based method is often preferred because it leads to more explainable answers in comparison with many machine learning-based methods. However, the semantic orientation of a word depends on its domain. Hence, a general-purpose sentiment lexicon may give sub-optimal performance compared with a domain-specific lexicon. It is challenging, though, to manually generate a domain-specific sentiment lexicon for each domain. Moreover, it is impractical to generate a complete sentiment lexicon for a domain from a single corpus. To this end, we propose an approach to automatically generate a domain-specific sentiment lexicon using a vector model enriched by weights. Importantly, we propose an incremental approach for updating an existing lexicon for either the same domain or a different domain (domain adaptation). Finally, we discuss how to incorporate sentiment lexicon information in neural models (word embeddings) for better performance. © Springer Nature Switzerland AG 2020.
