Publications

Publications by LIAAD

2018

YAKE! Collection-Independent Automatic Keyword Extractor

Authors
Campos, R; Mangaravite, V; Pasquali, A; Jorge, AM; Nunes, C; Jatowt, A;

Publication
ADVANCES IN INFORMATION RETRIEVAL (ECIR 2018)

Abstract
In this paper, we present YAKE!, a novel feature-based system for multi-lingual keyword extraction from single documents, which supports texts of different sizes, domains or languages. Unlike most systems, YAKE! does not rely on dictionaries or thesauri, nor is it trained against any corpora. Instead, we follow an unsupervised approach that builds upon features extracted from the text, making it applicable to documents written in many different languages without the need for external knowledge. This can be beneficial for a large number of tasks and a plethora of situations where access to training corpora is either limited or restricted. In this demo, we offer an easy-to-use, interactive session, where users from both academia and industry can try our system, either by using a sample document or by introducing their own text. As an add-on, we compare our extracted keywords against the output produced by the IBM Natural Language Understanding (IBM NLU) and RAKE systems. The YAKE! demo is available at http://bit.ly/YakeDemoECIR2018. A Python implementation of YAKE! is also available in the PyPI repository (https://pypi.python.org/pypi/yake/).

2018

Forgetting techniques for stream-based matrix factorization in recommender systems

Authors
Matuszyk, P; Vinagre, J; Spiliopoulou, M; Jorge, AM; Gama, J;

Publication
KNOWLEDGE AND INFORMATION SYSTEMS

Abstract
Forgetting is often considered a malfunction of intelligent agents; however, in a changing world forgetting has an essential advantage. It provides a means of adapting to change by removing the effects of obsolete (not necessarily old) information from models. This also applies to intelligent systems, such as recommender systems, which learn users' preferences and predict future items of interest. In this work, we present unsupervised forgetting techniques that make recommender systems adapt to changes in users' preferences over time. We propose eleven techniques that select obsolete information and three algorithms that enforce the forgetting in different ways. In our evaluation on real-world datasets, we show that forgetting obsolete information significantly improves the predictive power of recommender systems.
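
As a rough illustration of the "select obsolete information" step, the sketch below keeps only the most recent feedback events per user and reports which events drop out. This recency window is a simplified stand-in chosen for illustration and is not claimed to be one of the eleven techniques in the paper; the abstract itself notes that obsolete does not necessarily mean old. The class name `RecentWindow` and its parameters are hypothetical, and the enforcement step (how a factorization model is adjusted once events are forgotten) is deliberately left out.

```python
from collections import deque


class RecentWindow:
    """Hypothetical helper: retain the last `size` feedback events per user."""

    def __init__(self, size):
        self.size = size
        self.history = {}  # user -> deque of items, oldest first

    def observe(self, user, item):
        """Record an event and return the events that fell out of the window."""
        window = self.history.setdefault(user, deque())
        window.append(item)
        forgotten = []
        while len(window) > self.size:
            forgotten.append(window.popleft())
        return forgotten

    def retained(self, user):
        """Events still considered current for this user."""
        return list(self.history.get(user, ()))


window = RecentWindow(size=2)
for item in ["a", "b", "c"]:
    dropped = window.observe("u1", item)
# After the third event, "a" has been selected for forgetting.
```

A downstream model would receive the returned events and decide how to discount them, which is where the enforcement algorithms described in the abstract would come in.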

2018

Proceedings of the First Workshop on Narrative Extraction From Text (Text2Story 2018) co-located with 40th European Conference on Information Retrieval (ECIR 2018), Grenoble, France, March 26, 2018

Authors
Jorge, AM; Campos, R; Jatowt, A; Nunes, S;

Publication
Text2Story@ECIR

Abstract

2018

First International Workshop on Narrative Extraction from Texts: Text2Story 2018

Authors
Jorge, AM; Campos, R; Jatowt, A; Nunes, S;

Publication
ADVANCES IN INFORMATION RETRIEVAL (ECIR 2018)

Abstract

2018

Preface

Authors
Jorge, AM; Campos, R; Jatowt, A; Nunes, S;

Publication
CEUR Workshop Proceedings

Abstract

2018

Incremental Matrix Co-factorization for Recommender Systems with Implicit Feedback

Authors
Anyosa, SC; Vinagre, J; Jorge, AM;

Publication
Companion of The Web Conference 2018 (WWW 2018), Lyon, France, April 23-27, 2018

Abstract
Recommender systems try to predict which items a user will prefer. Traditional models for recommendation only take into account the user-item interaction, usually expressed by explicit ratings. However, web services nowadays continuously generate auxiliary data from users and items that can be incorporated into the recommendation model to improve recommendations. In this work, we propose an incremental Matrix Co-factorization model with implicit user feedback, considering a real-world data-stream scenario. This model can be seen as an extension of the conventional Matrix Factorization that includes additional dimensions to be decomposed in the common latent factor space. We test our proposal against a baseline algorithm that relies exclusively on interaction data, using prequential evaluation. Our experimental results show a significant improvement in the accuracy of recommendations, after incorporating an additional dimension in three music domain datasets. © 2018 IW3C2 (International World Wide Web Conference Committee), published under Creative Commons CC BY 4.0 License.
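
To make the prequential (test-then-learn) protocol mentioned above concrete, here is a minimal sketch using a plain incremental matrix factorization over positive-only feedback. It covers only an interaction-only model in the spirit of the baseline, not the co-factorization model the paper proposes. The names `IncrementalMF` and `prequential_recall`, the hyperparameters, and the recall-at-N metric are all assumptions made for illustration.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


class IncrementalMF:
    """Illustrative incremental SGD factorization for positive-only feedback."""

    def __init__(self, k=8, lr=0.05, reg=0.01, seed=0):
        self.k, self.lr, self.reg = k, lr, reg
        self.rng = random.Random(seed)
        self.users, self.items = {}, {}  # id -> latent vector

    def _new_vector(self):
        return [self.rng.gauss(0.0, 0.1) for _ in range(self.k)]

    def recommend(self, user, n=5):
        """Top-n known items by predicted score; empty for unseen users."""
        if user not in self.users:
            return []
        p = self.users[user]
        ranked = sorted(self.items, key=lambda i: dot(p, self.items[i]),
                        reverse=True)
        return ranked[:n]

    def update(self, user, item):
        """One SGD step that pushes the score of an observed pair toward 1."""
        p = self.users.setdefault(user, self._new_vector())
        q = self.items.setdefault(item, self._new_vector())
        err = 1.0 - dot(p, q)
        for f in range(self.k):
            pf, qf = p[f], q[f]
            p[f] += self.lr * (err * qf - self.reg * pf)
            q[f] += self.lr * (err * pf - self.reg * qf)


def prequential_recall(stream, model, n=5):
    """For each event: test against the current model first, then learn from it."""
    hits = []
    for user, item in stream:
        hits.append(1 if item in model.recommend(user, n) else 0)
        model.update(user, item)
    return sum(hits) / len(hits) if hits else 0.0


stream = [("u1", "a"), ("u2", "b"), ("u1", "b"), ("u2", "a")]
score = prequential_recall(stream, IncrementalMF())
```

Because every event is scored before the model sees it, the metric reflects performance on unseen data at each point in the stream, without a separate held-out split.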
