Publications by LIAAD

2019

Identifying topic relevant hashtags in Twitter streams

Authors
Figueiredo, F; Jorge, A;

Publication
INFORMATION SCIENCES

Abstract
Hashtags have become a crucial social media tool. Categorizing posts in a simple and informal way helps content spread through the web. At the same time, it enables users to easily find messages within a specific topic. However, the flexibility with which hashtags can be created and used carries some problems. Equivalent expressions, like synonyms, are handled as entirely different words. Conversely, the same hashtag may refer to different topics. In this paper, we present TORHID (Topic Relevant Hashtag Identification), a method that employs topic modeling to retrieve and identify hashtags relevant to a specific topic in Twitter streams, starting from a seed hashtag and resorting to a classifier to remove non-relevant hashtags. The result is a network of hashtags related to the seed, which we can use to deepen the initial search.

2019

Language Modelling Makes Sense: Propagating Representations through WordNet for Full-Coverage Word Sense Disambiguation

Authors
Loureiro, D; Jorge, AM;

Publication
57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019)

Abstract
Contextual embeddings represent a new generation of semantic representations learned from Neural Language Modelling (NLM) that addresses the issue of meaning conflation hampering traditional word embeddings. In this work, we show that contextual embeddings can be used to achieve unprecedented gains in Word Sense Disambiguation (WSD) tasks. Our approach focuses on creating sense-level embeddings with full coverage of WordNet, and without recourse to explicit knowledge of sense distributions or task-specific modelling. As a result, a simple Nearest Neighbors (k-NN) method using our representations is able to consistently surpass the performance of previous systems using powerful neural sequencing models. We also analyse the robustness of our approach when ignoring part-of-speech and lemma features, requiring disambiguation against the full sense inventory, and revealing shortcomings to be improved. Finally, we explore applications of our sense embeddings for concept-level analyses of contextual embeddings and their respective NLMs.
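
As a rough illustration of the nearest-neighbour step mentioned in the abstract above, the sketch below matches a contextual embedding against a small table of sense embeddings using cosine similarity. The three-dimensional vectors, the sense keys, and the function names are invented for this example; in the paper the embeddings come from a neural language model and cover all of WordNet.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_sense(context_vec, sense_vecs):
    """Return the sense key whose embedding is closest to the context (1-NN)."""
    return max(sense_vecs, key=lambda key: cosine(context_vec, sense_vecs[key]))


# Toy sense inventory (hypothetical keys and values, not real WordNet data).
SENSE_VECS = {
    "bank%finance": [0.9, 0.1, 0.0],
    "bank%river": [0.1, 0.9, 0.2],
}

# Hypothetical contextual embedding of "bank" in "we sat on the bank of the stream".
predicted = nearest_sense([0.2, 0.8, 0.1], SENSE_VECS)
print(predicted)
```

Restricting `sense_vecs` to the senses of one lemma and part of speech would correspond to the standard setting; passing the whole inventory would correspond to the harder full-inventory setting that the abstract also mentions.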

2019

Proceedings of Text2Story - 2nd Workshop on Narrative Extraction From Texts, co-located with the 41st European Conference on Information Retrieval, Text2Story@ECIR 2019, Cologne, Germany, April 14th, 2019

Authors
Jorge, AM; Campos, R; Jatowt, A; Bhatia, S;

Publication
Text2Story@ECIR

Abstract

2019

LIAAD at SemDeep-5 Challenge: Word-in-Context (WiC)

Authors
Loureiro, D; Jorge, A;

Publication
Proceedings of the 5th Workshop on Semantic Deep Learning, SemDeep@IJCAI 2019, Macau, China, August 12, 2019

Abstract

2019

Dataset Morphing to Analyze the Performance of Collaborative Filtering

Authors
Correia, A; Soares, C; Jorge, A;

Publication
Discovery Science - 22nd International Conference, DS 2019, Split, Croatia, October 28-30, 2019, Proceedings

Abstract
Machine Learning algorithms are often too complex to be studied from a purely analytical point of view. Alternatively, with a reasonably large number of datasets one can empirically observe the behavior of a given algorithm in different conditions and hypothesize some general characteristics. This knowledge about algorithms can be used to choose the most appropriate one given a new dataset. This very hard problem can be approached using metalearning. Unfortunately, the number of datasets available may not be sufficient to obtain reliable meta-knowledge. Additionally, datasets may change with time, by growing, shrinking and being edited, due to natural actions like people buying in an e-commerce site. In this paper we propose dataset morphing as the basis of a novel methodology that can help overcome these drawbacks and can be used to better understand ML algorithms. It consists of manipulating real datasets through the iterative application of gradual transformations (morphing) and by observing the changes in the behavior of learning algorithms while relating these changes with changes in the metafeatures of the morphed datasets. Although dataset morphing can be envisaged in a much wider framework, we focus on one very specific instance: the study of collaborative filtering algorithms on binary data. Results show that the proposed approach is feasible and that it can be used to identify useful metafeatures to predict the best collaborative filtering algorithm for a given dataset. © Springer Nature Switzerland AG 2019.
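
To make the idea of gradual transformation concrete, here is a minimal sketch that morphs one binary user-item dataset into another in a fixed number of steps and records a simple metafeature (density) at each step. The transformation used here (dropping source-only interactions and adding target-only ones in sorted order) and the names `morph` and `density` are assumptions made for illustration; the abstract does not specify the transformations used in the paper.

```python
def morph(source, target, steps):
    """Yield steps + 1 datasets that move gradually from source to target.

    Datasets are sets of (user, item) pairs; steps must be at least 1.
    """
    to_remove = sorted(source - target)  # interactions only in the source
    to_add = sorted(target - source)     # interactions only in the target
    for k in range(steps + 1):
        n_removed = len(to_remove) * k // steps
        n_added = len(to_add) * k // steps
        yield (source - set(to_remove[:n_removed])) | set(to_add[:n_added])


def density(data, n_users, n_items):
    """Fraction of the user-item matrix that is filled (a simple metafeature)."""
    return len(data) / (n_users * n_items)


# Two toy binary datasets over 3 users and 3 items (hypothetical data).
source = {(0, 0), (0, 1), (1, 1), (2, 2)}
target = {(0, 0), (1, 0), (1, 2), (2, 0), (2, 1), (2, 2)}

morphs = list(morph(source, target, steps=4))
densities = [density(m, n_users=3, n_items=3) for m in morphs]
print(densities)
```

In a fuller experiment one would also run each candidate collaborative filtering algorithm on every intermediate dataset and relate its performance to the recorded metafeatures.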

2019

Incremental Multi-Dimensional Recommender Systems: Co-Factorization vs Tensors

Authors
Ramalho, MS; Vinagre, J; Jorge, AM; Bastos, R;

Publication
2nd Workshop on Online Recommender Systems and User Modeling, ORSUM@RecSys 2019, 19 September 2019, Copenhagen, Denmark

Abstract
The present paper sets a milestone on incremental recommender system approaches by comparing several state-of-the-art algorithms with two different mathematical foundations - matrix and tensor factorization. Traditional Pairwise Interaction Tensor Factorization is revisited and converted into a scalable and incremental option that yields the best predictive power. A novel tensor-inspired approach is described. Finally, experiments compare contextless vs context-aware scenarios, the impact of noise on the algorithms, discrepancies between time complexity and execution times, and are run on five different datasets from three different recommendation areas - music, gross retail and garment. Relevant conclusions are drawn that aim to help choose the most appropriate algorithm to use when faced with a novel recommender task. © 2019 M.S. Ramalho, J. Vinagre, A.M. Jorge & R. Bastos.
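
For readers unfamiliar with incremental factorization, the sketch below shows the general idea in its simplest matrix form: each observed (user, item) event triggers a single stochastic gradient step, with no retraining on past data. This is a generic illustration under assumed hyperparameters, not the co-factorization or tensor algorithms compared in the paper; the class name `IncrementalMF` and all identifiers are hypothetical.

```python
import random


class IncrementalMF:
    """Matrix factorization updated one positive-only event at a time."""

    def __init__(self, k=4, lr=0.1, reg=0.01, seed=0):
        self.k, self.lr, self.reg = k, lr, reg
        self.rng = random.Random(seed)
        self.users, self.items = {}, {}

    def _vec(self, table, key):
        # Lazily create a small random latent vector for unseen users/items.
        if key not in table:
            table[key] = [self.rng.gauss(0.0, 0.1) for _ in range(self.k)]
        return table[key]

    def score(self, user, item):
        p, q = self._vec(self.users, user), self._vec(self.items, item)
        return sum(a * b for a, b in zip(p, q))

    def update(self, user, item):
        # One SGD step pushing the predicted score towards 1 (the event occurred).
        p, q = self._vec(self.users, user), self._vec(self.items, item)
        err = 1.0 - sum(a * b for a, b in zip(p, q))
        for f in range(self.k):
            pf, qf = p[f], q[f]
            p[f] += self.lr * (err * qf - self.reg * pf)
            q[f] += self.lr * (err * pf - self.reg * qf)


model = IncrementalMF()
before = model.score("u1", "song_a")
for _ in range(100):  # replay the same event to make the effect visible
    model.update("u1", "song_a")
after = model.score("u1", "song_a")
print(before, after)
```

A context-aware variant would add a third set of latent vectors for the context dimension, which is where the matrix and tensor formulations discussed in the abstract diverge.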
