2018
Authors
Sousa, R; Gama, J;
Publication
33RD ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING
Abstract
This paper describes the development of a co-training method (a semi-supervised approach) that uses multiple learners for single-target regression on data streams. The experimental evaluation compares a realistic supervised scenario, in which all unlabelled examples are discarded, with scenarios where unlabelled examples are used to improve the regression model. The results provide fair evidence that the proposed co-training method reduces the error measure. However, the error reduction is still relatively small.
2018
Authors
Fernandes, S; Fanaee T, H; Gama, J;
Publication
DATA MINING AND KNOWLEDGE DISCOVERY
Abstract
Due to the scale and complexity of today's social networks, it becomes infeasible to mine them with traditional approaches. A possible solution to reduce such scale and complexity is to produce a compact (lossy) version of the network that represents its major properties. This task is known as graph summarization, which is the subject of this research. Our focus is on time-evolving graphs, a more complex scenario where the dynamics of the network should also be taken into account. We address this problem using tensor decomposition, which enables us to capture the multi-way structure of the time-evolving network. This property is unique and is impossible to obtain with other approaches such as matrix factorization. Experimental evaluation on five real-world networks shows promising results, demonstrating that tensor decomposition is quite useful for summarizing dynamic networks.
2018
Authors
Tabassum, S; Pereira, FSF; Fernandes, S; Gama, J;
Publication
WILEY INTERDISCIPLINARY REVIEWS-DATA MINING AND KNOWLEDGE DISCOVERY
Abstract
Social network analysis (SNA) is a core pursuit of analyzing social networks today. In addition to the usual statistical techniques of data analysis, these networks are investigated using SNA measures. SNA helps in understanding the dependencies between social entities in the data, characterizing their behaviors and their effect on the network as a whole and over time. Therefore, this article attempts to provide a succinct overview of SNA in diverse topological networks (static, temporal, and evolving networks) and perspectives (ego-networks). As one of the primary applications of SNA is networked data mining, we provide a brief overview of network mining models as well; in this way, we present the readers with a concise guided tour from analysis to mining of networks. This article is categorized under: Application Areas > Science and Technology; Technologies > Machine Learning; Fundamental Concepts of Data and Knowledge > Human Centricity and User Interaction; Commercial, Legal, and Ethical Issues > Social Considerations.
2018
Authors
Vinagre, J; Jorge, AM; Gama, J;
Publication
EXPERT SYSTEMS
Abstract
Ensemble methods have been successfully used in the past to improve recommender systems; however, they have never been studied with incremental recommendation algorithms. Many online recommender systems deal with continuous, potentially fast, and unbounded flows of data (big data streams) and often need to be responsive to fresh user feedback, adjusting recommendations accordingly. This is clear in tasks such as social network feeds, news recommender systems, automatic playlist completion, and other similar applications. Batch ensemble approaches are not suitable to perform continuous learning, given the complexity of retraining new models on demand. In this paper, we adapt a general purpose online bagging algorithm for top-N recommendation tasks and propose two novel online bagging methods specifically tailored for recommender systems. We evaluate the three approaches, using an incremental matrix factorization algorithm for top-N recommendation with positive-only user feedback data as the base model. Our results show that online bagging is able to improve accuracy up to 55% over the baseline, with manageable computational overhead.
2019
Authors
Pereira, FSF; Tabassum, S; Gama, J; de Amo, S; Oliveira, GMB;
Publication
Studies in Big Data
Abstract
Social networks evolve through the continuous interaction between users, with nodes associating and disassociating with each other over time. The analysis of such networks is especially challenging, because it needs to be performed with an online approach, under the one-pass constraint of data streams. Such evolving behavior leads to changes in the network topology that can be investigated from different perspectives. In this work we focus on the evolution of node positions, a node-centric perspective. Our goal is to spot change-points in an evolving network at which a node deviates from its normal behavior. Therefore, we propose a change detection model for processing evolving network streams which employs three different aggregating mechanisms for tracking the evolution of centrality metrics of a node. Our model is space- and time-efficient: some of its mechanisms are memoryless, and the others require at most the network of the current time step T. Additionally, we compare how the dynamics of real-world preferences influence the fluctuations of different centralities. Subsequently, we apply our model to the user preference change detection task, reaching competitive levels of accuracy on a Twitter network. © 2019, Springer International Publishing AG, part of Springer Nature.
2018
Authors
Vinagre, J; Jorge, AM; Gama, J;
Publication
Discovery Science - 21st International Conference, DS 2018, Limassol, Cyprus, October 29-31, 2018, Proceedings
Abstract
Ensemble models have proven successful for batch recommendation algorithms; however, they have not been well studied in streaming applications. Such applications typically use incremental learning, to which standard ensemble techniques are not trivially applicable. In this paper, we study the application of three variants of online gradient boosting to top-N recommendation tasks with implicit data, in a streaming data environment. Weak models are built using a simple incremental matrix factorization algorithm for implicit feedback. Our results show a significant improvement of up to 40% over the baseline standalone model. We also show that the overhead of running multiple weak models is easily manageable in stream-based applications. © 2018, Springer Nature Switzerland AG.