2019
Authors
Vinagre, J; Jorge, AM; Bifet, A; Al Ghossein, M;
Publication
RECSYS 2019: 13TH ACM CONFERENCE ON RECOMMENDER SYSTEMS
Abstract
The ever-growing nature of user-generated data in online systems poses obvious challenges for how we process such data. Typically, this issue is regarded as a scalability problem and has been mainly addressed with distributed algorithms able to train on massive amounts of data in short time windows. However, data inevitably accumulates at high speed, and eventually one needs to discard or archive some of it. Moreover, the dynamic nature of data in user modeling and recommender systems, such as changes in user preferences and the continuous introduction of new users and items, makes it increasingly difficult to maintain up-to-date, accurate recommendation models. The objective of this workshop is to bring together researchers and practitioners interested in incremental and adaptive approaches to stream-based user modeling, recommendation and personalization, including algorithms, evaluation issues, incremental content and context mining, privacy and transparency, temporal recommendation, and software frameworks for continuous learning.
2019
Authors
Figueiredo, F; Jorge, A;
Publication
INFORMATION SCIENCES
Abstract
Hashtags have become a crucial social media tool. Categorizing posts in a simple and informal way helps to spread content through the web. At the same time, it enables users to easily find messages within a specific topic. However, the flexibility with which hashtags can be used and created carries some problems. Equivalent expressions, like synonyms, are handled as entirely different words. On the other hand, the same hashtag may refer to different topics. In this paper, we present TORHID (Topic Relevant Hashtag Identification), a method that employs topic modeling to retrieve and identify hashtags relevant to a specific topic in Twitter streams, starting from a seed hashtag and resorting to a classifier to remove non-relevant hashtags. The result is a network of hashtags related to the seed that we can use to deepen the initial search.
2019
Authors
Loureiro, D; Jorge, AM;
Publication
57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019)
Abstract
Contextual embeddings represent a new generation of semantic representations learned from Neural Language Modelling (NLM) that addresses the issue of meaning conflation hampering traditional word embeddings. In this work, we show that contextual embeddings can be used to achieve unprecedented gains in Word Sense Disambiguation (WSD) tasks. Our approach focuses on creating sense-level embeddings with full coverage of WordNet, and without recourse to explicit knowledge of sense distributions or task-specific modelling. As a result, a simple Nearest Neighbors (k-NN) method using our representations is able to consistently surpass the performance of previous systems using powerful neural sequencing models. We also analyse the robustness of our approach when ignoring part-of-speech and lemma features, requiring disambiguation against the full sense inventory, and revealing shortcomings to be improved. Finally, we explore applications of our sense embeddings for concept-level analyses of contextual embeddings and their respective NLMs.
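As a rough illustration of the nearest-neighbor idea mentioned in this abstract, the sketch below matches a context vector against a small table of sense vectors using cosine similarity. The sense keys, the 3-dimensional toy vectors, and the helper names `cosine` and `nearest_sense` are all hypothetical; real sense embeddings would come from a neural language model and are not reproduced here.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest_sense(context_vec, sense_vecs):
    """Return the sense key whose vector is most similar to the context (1-NN)."""
    return max(sense_vecs, key=lambda s: cosine(context_vec, sense_vecs[s]))

# Toy, hand-made vectors (assumption: purely illustrative values).
SENSE_VECS = {
    "bank%finance": [0.9, 0.1, 0.0],
    "bank%river": [0.1, 0.9, 0.2],
}

# Hypothetical contextual embedding of "bank" in a finance-related sentence.
context = [0.8, 0.2, 0.1]
predicted = nearest_sense(context, SENSE_VECS)
print(predicted)
```

With these toy values the finance sense is the closest match; restricting the candidate senses by lemma or part of speech would simply mean passing a smaller dictionary.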
2019
Authors
Jorge, AM; Campos, R; Jatowt, A; Bhatia, S;
Publication
Text2Story@ECIR
Abstract
2019
Authors
Loureiro, D; Jorge, A;
Publication
Proceedings of the 5th Workshop on Semantic Deep Learning, SemDeep@IJCAI 2019, Macau, China, August 12, 2019
Abstract
2019
Authors
Correia, A; Soares, C; Jorge, A;
Publication
Discovery Science - 22nd International Conference, DS 2019, Split, Croatia, October 28-30, 2019, Proceedings
Abstract
Machine Learning algorithms are often too complex to be studied from a purely analytical point of view. Alternatively, with a reasonably large number of datasets one can empirically observe the behavior of a given algorithm in different conditions and hypothesize some general characteristics. This knowledge about algorithms can be used to choose the most appropriate one given a new dataset. This very hard problem can be approached using metalearning. Unfortunately, the number of datasets available may not be sufficient to obtain reliable meta-knowledge. Additionally, datasets may change with time, by growing, shrinking and being edited, due to natural actions like people buying on an e-commerce site. In this paper we propose dataset morphing as the basis of a novel methodology that can help overcome these drawbacks and can be used to better understand ML algorithms. It consists of manipulating real datasets through the iterative application of gradual transformations (morphing) and observing the changes in the behavior of learning algorithms while relating these changes to changes in the metafeatures of the morphed datasets. Although dataset morphing can be envisaged in a much wider framework, we focus on one very specific instance: the study of collaborative filtering algorithms on binary data. Results show that the proposed approach is feasible and that it can be used to identify useful metafeatures to predict the best collaborative filtering algorithm for a given dataset. © Springer Nature Switzerland AG 2019.
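The notion of gradually transforming one dataset into another while tracking a metafeature can be sketched as follows. This is only one possible reading under stated assumptions: the transformation used here (copying differing cells from a target binary matrix into a source matrix in equal batches) and the single metafeature (density) are chosen for illustration and are not taken from the paper. The names `density` and `morph` are hypothetical.

```python
import random

def density(matrix):
    """Fraction of cells equal to 1 in a binary matrix (a simple metafeature)."""
    cells = [x for row in matrix for x in row]
    return sum(cells) / len(cells)

def morph(source, target, steps, seed=0):
    """Return steps + 1 snapshots moving from source to target.

    Assumption: both matrices have the same shape. At each step an equal
    share of the cells that differ is overwritten with the target value.
    """
    rng = random.Random(seed)
    current = [row[:] for row in source]
    differing = [
        (i, j)
        for i, row in enumerate(source)
        for j, _ in enumerate(row)
        if source[i][j] != target[i][j]
    ]
    rng.shuffle(differing)
    snapshots = [[row[:] for row in current]]
    for step in range(1, steps + 1):
        start = len(differing) * (step - 1) // steps
        stop = len(differing) * step // steps
        for i, j in differing[start:stop]:
            current[i][j] = target[i][j]
        snapshots.append([row[:] for row in current])
    return snapshots

# Toy user-item matrices (assumption: illustrative values only).
source = [[1, 0, 0], [0, 0, 0]]
target = [[1, 1, 1], [0, 1, 1]]
snapshots = morph(source, target, steps=4)
densities = [density(s) for s in snapshots]
print(densities)
```

In a fuller experiment, each snapshot would be fed to the learning algorithms under study so that changes in their performance can be related to changes in the recorded metafeatures.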