2009
Authors
Borges, J; Levene, M;
Publication
Encyclopedia of Data Warehousing and Mining, Second Edition
Abstract
2012
Authors
Borges, J; Real, AC; Cabral, JS; Jones, GV;
Publication
Journal of Wine Economics - J Wine Econ
Abstract
2000
Authors
Borges, J; Levene, M;
Publication
SIGKDD Explor. Newsl. - ACM SIGKDD Explorations Newsletter
Abstract
2007
Authors
Borges, J; Levene, M;
Publication
IEEE Transactions on Knowledge and Data Engineering
Abstract
Markov models have been widely used to represent and analyze user Web navigation data. In previous work, we have proposed a method to dynamically extend the order of a Markov chain model and a complementary method for assessing the predictive power of such a variable-length Markov chain. Herein, we review these two methods and propose a novel method for measuring the ability of a variable-length Markov model to summarize user Web navigation sessions up to a given length. Although the summarization ability of a model is important to enable the identification of user navigation patterns, the ability to make predictions is important in order to foresee the next link choice of a user after following a given trail so as, for example, to personalize a Web site. We present an extensive experimental evaluation providing strong evidence that prediction accuracy increases linearly with summarization ability.
2006
Authors
Borges, J; Levene, M;
Publication
World Wide Web-Internet and Web Information Systems
Abstract
We compare two link analysis ranking methods of web pages in a site. The first, called Site Rank, is an adaptation of PageRank to the granularity of a web site and the second, called Popularity Rank, is based on the frequencies of user clicks on the outlinks in a page that are captured by navigation sessions of users through the web site. We ran experiments on artificially created web sites of different sizes and on two real data sets, employing the relative entropy to compare the distributions of the two ranking methods. For the real data sets we also employ a nonparametric measure, called Spearman's footrule, which we use to compare the top-ten web pages ranked by the two methods. Our main result is that the distributions of the Popularity Rank and Site Rank are surprisingly close to each other, implying that the topology of a web site is instrumental in guiding users through the site. Thus, in practice, the Site Rank provides a reasonable first-order approximation of the aggregate behaviour of users within a web site given by the Popularity Rank.
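The two comparison measures named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the relative entropy is computed in bits, both distributions are assumed to cover the same pages with non-zero probability under the second one, and the footrule is assumed to be applied to two lists containing exactly the same pages.

```python
import math


def relative_entropy(p, q):
    """Relative entropy D(p || q) in bits.

    p and q map page -> probability. Assumes q[page] > 0 wherever p[page] > 0.
    """
    total = 0.0
    for page, p_val in p.items():
        if p_val > 0:
            total += p_val * math.log2(p_val / q[page])
    return total


def footrule(rank_a, rank_b):
    """Spearman's footrule: sum of absolute position differences.

    Assumes rank_a and rank_b are orderings of the same set of pages.
    """
    pos_b = {page: i for i, page in enumerate(rank_b)}
    return sum(abs(i - pos_b[page]) for i, page in enumerate(rank_a))


# Illustrative (made-up) rank distributions over three pages.
site_rank = {"home": 0.5, "news": 0.3, "about": 0.2}
popularity_rank = {"home": 0.6, "news": 0.25, "about": 0.15}
divergence = relative_entropy(popularity_rank, site_rank)
distance = footrule(["home", "news", "about"], ["news", "home", "about"])
```

A relative entropy near zero indicates that the two distributions are close, and a footrule of zero means the two orderings are identical.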
2007
Authors
Borges, J; Levene, M;
Publication
Soft Computing
Abstract
We present two methods for testing the predictive power of a variable-length Markov chain induced from a collection of user web navigation sessions. The collection of sessions is split into a training and a test set. The first method uses a chi-squared statistical test to measure the significance of the distance between the distribution of the probabilities assigned to the test trails by a Markov model built from the full collection of sessions and a model built from the training set. The statistical test measures the ability of the model to generalise its predictions to the unseen sessions from the test set. The second method evaluates the model's ability to predict the last page of a navigation session based on the preceding pages viewed by recording the mean absolute error of the rank of the last occurring page among the predictions provided by the model. Experimental results on both real and random data sets are reported, and the results show that in most cases a second-order model is able to capture sufficient history to predict the next link choice with high accuracy.
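The second evaluation method in this abstract can be sketched roughly as follows. This is a simplified illustration, not the authors' procedure: it substitutes a plain first-order Markov model for the variable-length one, the function names are hypothetical, the error is taken as the zero-based rank of the true last page (0 when it is the top prediction), and test sessions whose last transition was never seen in training are skipped.

```python
from collections import Counter, defaultdict


def train_first_order(sessions):
    """Count page-to-page transitions over the training sessions."""
    counts = defaultdict(Counter)
    for session in sessions:
        for current, following in zip(session, session[1:]):
            counts[current][following] += 1
    return counts


def mean_abs_rank_error(counts, test_sessions):
    """Mean zero-based rank of the true last page among the predictions.

    Predictions for a session are the successors of its second-to-last page,
    ordered by decreasing transition count. Returns None if no session
    could be scored.
    """
    errors = []
    for session in test_sessions:
        if len(session) < 2:
            continue
        previous, actual = session[-2], session[-1]
        ranked = [page for page, _ in counts.get(previous, Counter()).most_common()]
        if actual in ranked:
            errors.append(ranked.index(actual))
    return sum(errors) / len(errors) if errors else None


# Illustrative (made-up) sessions.
training = [["A", "B", "C"], ["A", "B", "C"], ["A", "B", "D"]]
testing = [["A", "B", "C"], ["A", "B", "D"]]
model = train_first_order(training)
error = mean_abs_rank_error(model, testing)
```

Here the first test session ends on the top-ranked successor of B and the second on the runner-up, so the mean error is 0.5; a lower value means the true last page tends to appear nearer the top of the prediction list.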