2016
Authors
Pasquali, A; Canavarro, M; Campos, R; Jorge, AM;
Publication
Proceedings of the Ninth International C* Conference on Computer Science & Software Engineering, C3S2E '16, Porto, Portugal, July 20-22, 2016
Abstract
Automatic topic detection in document collections is an important tool for various tasks. In particular, it is valuable for studying and understanding socio-political phenomena. A currently relevant example is the automatic analysis of streams of posts issued by different activist groups in the current Brazilian turmoil, through the analysis of the generated streams of texts published on the web. It is useful to determine the relative importance of the different topics identified. We can find in the literature proposals for measuring topic relevance. In this paper, we adopt two such measures and apply them to data sets extracted from Facebook pages related to Brazilian political activism. On top of the analysis, we then carry out an experimental evaluation of the human interpretability of these two measures by comparing their outcomes with the opinion of three Brazilian professionals from the field of Communication Science and media-activists. Copyright 2016 ACM.
2014
Authors
Campos, R; Dias, G; Jorge, AM; Nunes, C;
Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Abstract
In this paper, we present GTE-Cluster, an online temporal search interface which consistently allows searching for topics in a temporal perspective by clustering relevant temporal Web search results. GTE-Cluster is designed to improve user experience by augmenting document relevance with temporal relevance. The rationale is that offering the user a comprehensive temporal perspective of a topic is intuitively more informative than retrieving a result that only contains topical information. Our system does not pose any constraint in terms of language or domain, so users can issue queries in any language and on any subject, from business, culture, and politics to music, to cite just a few. The ability to exploit this information in a temporal manner can be, from a user perspective, potentially useful for several tasks, including user query understanding or temporal clustering. © 2014 Springer International Publishing Switzerland.
2015
Authors
Derczynski, L; Stroetgen, J; Campos, R; Alonso, O;
Publication
INFORMATION PROCESSING & MANAGEMENT
Abstract
The Special Issue of Information Processing and Management includes research papers on the intersection between time and information retrieval. In 'Evaluating Document Filtering Systems over Time', Tom Kenter and Krisztian Balog propose a time-aware way of measuring a system's performance at filtering documents. Manika Kar, Sérgio Nunes and Cristina Ribeiro present interesting methods for summarizing changes in dynamic text collections over time in their paper 'Summarization of Changes in Dynamic Text Collection using Latent Dirichlet Allocation Model.' Hideo Joho, Adam Jatowt and Roi Blanco report on the temporal information searching behaviour of users and their strategies for dealing with searches that have a temporal nature in 'Temporal Information Searching Behaviour and Strategies', a user study. In controlled settings, thirty participants are asked to perform searches on an array of topics on the web to find information related to particular time scopes. Adam Jatowt, Ching-man Au Yeung and Katsumi Tanaka present a 'Generic Method for Detecting Content Time of Documents'. The authors propose several methods for estimating the focus time of documents, i.e. the time a document's content refers to. Xujian Zhao, Peiquan Jin and Lihua Yue present an approach to determining the time of the underlying topic or event in their article entitled 'Discovering Topic Time from Web News'.
2017
Authors
Campos, R; Dias, G; Jorge, AM; Nunes, C;
Publication
INFORMATION RETRIEVAL JOURNAL
Abstract
Despite a clear improvement of search and retrieval temporal applications, current search engines are still mostly unaware of the temporal dimension. Indeed, in most cases, systems are limited to offering the user the chance to restrict the search to a particular time period or to simply rely on an explicitly specified time span. If the user is not explicit in his/her search intents (e.g., "philip seymour hoffman"), search engines may likely fail to present an overall historic perspective of the topic. In most such cases, they are limited to retrieving the most recent results. One possible solution to this shortcoming is to understand the different time periods of the query. In this context, most state-of-the-art methodologies consider any occurrence of temporal expressions in web documents and other web data as equally relevant to an implicit time sensitive query. To approach this problem in a more adequate manner, we propose in this paper the detection of relevant temporal expressions to the query. Unlike previous metadata and query log-based approaches, we show how to achieve this goal based on information extracted from document content. However, instead of simply focusing on the detection of the most obvious date, we are also interested in retrieving the set of dates that are relevant to the query. Towards this goal, we define a general similarity measure that makes use of co-occurrences of words and years based on corpus statistics and a classification methodology that is able to identify the set of top relevant dates for a given implicit time sensitive query, while filtering out the non-relevant ones. Through extensive experimental evaluation against several baselines on web corpora collections, we show that our approach offers promising results in the field of temporal information retrieval (T-IR).
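The idea of scoring years by how often they co-occur with query words, then keeping only the years above a cut-off, can be sketched as follows. This is a minimal illustration, not the measure defined in the paper: the abstract does not give the formula, so the Dice coefficient, the averaging over query terms, the fixed threshold, and the names `dice` and `relevant_years` are all assumptions made for this example.

```python
import re

# Assumption: a "year" is any four-digit token from 1000 to 2099.
YEAR = re.compile(r"(1\d{3}|20\d{2})")


def tokens(text):
    """Lower-cased set of word tokens in a text."""
    return set(re.findall(r"\w+", text.lower()))


def dice(token_sets, term, year):
    """Dice coefficient between a term and a year over document token sets.

    Stand-in for the paper's similarity measure, which the abstract
    does not specify.
    """
    with_term = sum(1 for ts in token_sets if term in ts)
    with_year = sum(1 for ts in token_sets if year in ts)
    with_both = sum(1 for ts in token_sets if term in ts and year in ts)
    if with_term + with_year == 0:
        return 0.0
    return 2 * with_both / (with_term + with_year)


def relevant_years(docs, query, threshold=0.5):
    """Years whose mean Dice score against the query terms meets the threshold.

    The fixed threshold is a simplification of the classification step
    mentioned in the abstract.
    """
    token_sets = [tokens(d) for d in docs]
    terms = tokens(query)
    if not terms:
        return []
    years = {t for ts in token_sets for t in ts if YEAR.fullmatch(t)}
    scores = {
        y: sum(dice(token_sets, t, y) for t in terms) / len(terms)
        for y in years
    }
    kept = [y for y, s in scores.items() if s >= threshold]
    # Highest score first; ties broken by the year string.
    return sorted(kept, key=lambda y: (-scores[y], y))
```

For example, given a handful of snippets that mention "Hoffman" alongside different years, `relevant_years(snippets, "hoffman", threshold=0.6)` keeps only the years that appear mostly in the same snippets as the query word and drops years that occur in unrelated snippets.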
2014
Authors
Campos, R; Dias, G; Jorge, AM; Nunes, C;
Publication
Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management, CIKM 2014, Shanghai, China, November 3-7, 2014
Abstract
Temporal information retrieval has been a topic of great interest in recent years. Despite the efforts that have been conducted so far, most popular search engines remain underdeveloped when it comes to explicitly considering the use of temporal information in their search process. In this paper we present GTE-Rank, an online searching tool that takes time into account when ranking time-sensitive query web search results. GTE-Rank is defined as a linear combination of topical and temporal scores to reflect the relevance of any web page both in topical and temporal dimensions. The resulting system can be explored graphically through a search interface made available for research purposes.
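A linear combination of a topical and a temporal score can be sketched as below. This is a hedged illustration only: the weight `alpha`, the assumption that both scores lie in [0, 1], and the names `ScoredPage`, `combined_score`, and `rank` are hypothetical and do not come from the GTE-Rank paper.

```python
from typing import NamedTuple


class ScoredPage(NamedTuple):
    url: str
    topical: float   # assumed topical relevance in [0, 1]
    temporal: float  # assumed temporal relevance in [0, 1]


def combined_score(page, alpha=0.5):
    """Weighted sum of topical and temporal scores.

    alpha is a hypothetical mixing weight: 1.0 ranks purely by topic,
    0.0 purely by time.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return alpha * page.topical + (1.0 - alpha) * page.temporal


def rank(pages, alpha=0.5):
    """Return pages sorted by combined score, best first."""
    return sorted(pages, key=lambda p: combined_score(p, alpha), reverse=True)
```

With `alpha=1.0` the ranking reduces to a purely topical one, so lowering `alpha` shows how much the temporal signal reorders the results.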
2015
Authors
Campos, R; Dias, G; Jorge, AM; Jatowt, A;
Publication
ACM COMPUTING SURVEYS
Abstract
Temporal information retrieval has been a topic of great interest in recent years. Its purpose is to improve the effectiveness of information retrieval methods by exploiting temporal information in documents and queries. In this article, we present a survey of the existing literature on temporal information retrieval. In addition to giving an overview of the field, we categorize the relevant research, describe the main contributions, and compare different approaches. We organize existing research to provide a coherent view, discuss several open issues, and point out some possible future research directions in this area. Despite significant advances, the area lacks a systematic arrangement of prior efforts and an overview of state-of-the-art approaches. Moreover, an effective end-to-end temporal retrieval system that exploits temporal information to improve the quality of the presented results remains undeveloped.