2006
Authors
Lopes, CT; David, G;
Publication
COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2006, PT 4
Abstract
Usage analysis of a Web Information System helps to predict user needs, assess the system's impact and guide its improvement. This is usually done by analysing clickstreams, a low-level approach involving huge amounts of data that calls for data warehouse techniques. This paper presents a dimensional model to monitor user behaviour in Higher Education Web Information Systems and an architecture for the extraction, transformation and load process. These have been applied in the development of a data warehouse to monitor the use of SIGARRA, the University of Porto's Higher Education Web Information System. The efficiency and effectiveness of this monitoring method were confirmed by the knowledge extracted from an analysis of a 3-month period. The main results and recommendations are also briefly described.
2011
Authors
Nunes, S; Ribeiro, C; David, G;
Publication
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY
Abstract
In real-world information retrieval systems, the underlying document collection is rarely stable or definitive. This work is focused on the study of signals extracted from the content of documents at different points in time for the purpose of weighting individual terms in a document. The basic idea behind our proposals is that terms that have existed for a longer time in a document should have a greater weight. We propose 4 term weighting functions that use each document's history to estimate a current term score. To evaluate this thesis, we conduct 3 independent experiments using a collection of documents sampled from Wikipedia. In the first experiment, we use data from Wikipedia to judge each set of terms. In the second experiment, we use an external collection of tags from a popular social bookmarking service as a gold standard. In the third experiment, we crowdsource user judgments to collect feedback on term preference. Across all experiments, the results consistently support our thesis. We show that temporally aware measures, specifically the proposed revision term frequency and revision term frequency span, outperform a term-weighting measure based on raw term frequency alone.
2003
Authors
Nogueira, V; Abreu, S; David, G;
Publication
VIII Jornadas Ingeniería del Software y Bases de Datos (JISBD 2003), 12-14 Noviembre 2003, Alicante
Abstract
2001
Authors
Ribeiro, C; David, G;
Publication
ICHIM (1)
Abstract
1990
Authors
David, G; Porto, A;
Publication
Proceedings of the ICLP 1990 Workshop on Logic Programming Environments, Eilat, Israel, June 16, 1990. Technical Report, ECRC IR-LP-31-25
Abstract
2008
Authors
Nunes, S; Ribeiro, C; David, G;
Publication
NIST Special Publication
Abstract
This paper presents the participation of FEUP, from the University of Porto, in the TREC 2008 Blog Track. FEUP participated in two tasks, the baseline adhoc retrieval task and the blog finding distillation task. Our approach was focused on the use of the temporal information available in the TREC Blog06 collection. For the baseline adhoc retrieval task, a simple temporal sort was evaluated. In the blog finding distillation task, we tested three alternative scoring functions based on temporal evidence. All features were combined with a BM25 baseline run using a standard rank aggregation approach. We observed small, but statistically significant, improvements in several evaluation measures when temporal information was used.