2008
Authors
Nunes, S; Ribeiro, C; David, G;
Publication
WikiSym 2008 - The 4th International Symposium on Wikis, Proceedings
Abstract
Wikis are popular tools commonly used to support distributed collaborative work. Wikis can be seen as virtual scrapbooks that anyone can edit without any specific technical know-how. Wikipedia is a flagship example of a real-world application of wikis. Due to the large scale of Wikipedia, it is difficult to grasp much of the information stored in this wiki. We address one particular aspect of this issue by looking at the revision history of each article. By plotting the revision activity on a timeline, we expose the article's complete history in an easily understandable format. We present WIKICHANGES, a web-based application designed to plot an article's revision timeline in real time. WIKICHANGES also includes a web browser extension that incorporates activity sparklines into the real Wikipedia. Finally, we introduce a revisions summarization task that addresses the need to understand what occurred during a given set of revisions. We present a first approach to this task using tag clouds to present the revisions made. © 2008 ACM.
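As a rough illustration of the revision timeline idea described in this abstract, the sketch below counts revisions per month from a list of timestamps. The function name `revisions_per_month` and the ISO 8601 timestamp format are assumptions made for this example; the abstract does not describe how WIKICHANGES aggregates or retrieves revisions.

```python
from collections import Counter


def revisions_per_month(timestamps):
    """Count revisions per calendar month.

    Assumes ISO 8601 timestamps such as "2008-01-15T10:00:00Z"
    (a hypothetical input format, not taken from the paper).
    """
    # The first 7 characters of an ISO timestamp are "YYYY-MM".
    counts = Counter(ts[:7] for ts in timestamps)
    # Sort by month so the result can be plotted as a timeline.
    return sorted(counts.items())


sample_timestamps = [
    "2008-01-15T10:00:00Z",
    "2008-01-20T18:30:00Z",
    "2008-03-02T09:12:00Z",
]
activity = revisions_per_month(sample_timestamps)
```

Each `(month, count)` pair is one point on the timeline; months with no revisions are simply absent and would need to be filled with zeros before plotting a sparkline.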
2007
Authors
Calistru, C; Ribeiro, C; David, G; Rodrigues, I; Laboreiro, G;
Publication
2007 TREC Video Retrieval Evaluation Notebook Papers
Abstract
The INESC Porto group has participated in the search task (automatic and interactive). Our approach combines high-level features (the 39 concepts of the LSCOM-Lite set) with low-level features. We use a large set of low-level features with the intention of analysing as many facets as possible of each shot. The aggregation of large feature sets can be time-consuming, as it needs to be done at query time. We have developed the BitMatrix indexing method to speed up the search process. For each shot, binary signatures in the form of bit sequences are obtained in an on-line process. At query time, the query bit signature is compared to each of the shots' signatures. The automatic run performs above the median, in spite of not using any classifier or any other knowledge sources except the translation of the query into LSCOM-Lite concepts.
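The comparison of a query bit signature against each shot's signature can be sketched as follows. This is a minimal illustration only: it assumes signatures are fixed-length bit sequences stored as Python integers and that they are compared by Hamming distance. The abstract does not state which comparison BitMatrix uses, and the names `hamming_distance` and `rank_shots` are hypothetical.

```python
def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two signatures differ."""
    # XOR leaves a 1 wherever the two signatures disagree.
    return bin(a ^ b).count("1")


def rank_shots(query_sig: int, shot_sigs: dict) -> list:
    """Return shot ids ordered from most to least similar to the query.

    Assumption: a smaller Hamming distance means a better match.
    Ties are broken by shot id to keep the ordering deterministic.
    """
    return sorted(
        shot_sigs,
        key=lambda shot: (hamming_distance(query_sig, shot_sigs[shot]), shot),
    )


# Toy 8-bit signatures (made-up values, for illustration only).
shots = {
    "shot_a": 0b10110010,
    "shot_b": 0b10110011,
    "shot_c": 0b01001101,
}
ranking = rank_shots(0b10110011, shots)
```

Because each comparison is a single XOR followed by a bit count, scanning all signatures is cheap compared with aggregating the full low-level feature vectors at query time.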
2007
Authors
Ribeiro, C; David, G; Calistru, C;
Publication
ASIAN DIGITAL LIBRARIES: LOOKING BACK 10 YEARS AND FORGING NEW FRONTIERS, PROCEEDINGS
Abstract
The paper presents a multimedia database model accounting for the representation of documents, collections and the associated metadata. Appropriate structures are provided for descriptive metadata and for metadata resulting from automatic content analysis. The model is based on the identification and unification of the main concepts in the archival standards and the audiovisual area. The main features of the model, designed to support multimedia database applications, are the integration of descriptive and content analysis metadata, the association of metadata to collections as well as to items, the extensibility with respect to the inclusion of new descriptors and the support for several retrieval modes. The MetaMedia application development platform, based on the model, has been used to support the construction of a historic documentation collection where a common web interface provides collection administrators, metadata creators and visitors with a multi-faceted view of the repository.
2007
Authors
Ribeiro, C; David, G; Calistru, C;
Publication
Research and Advanced Technology for Digital Libraries, Proceedings
Abstract
The web is currently the information searching and browsing environment of choice for scholars and lay users alike. The goal of most cultural heritage applications is to interest a large audience, and therefore web interfaces are being developed even when part of their functionality is not offered to the general public. We present a web-based interface for managing, browsing and searching a repository of historic documents. The documents pertain to a region which was an important regional power in medieval times, and their originals are in the custody of the Portuguese national archives. The challenges of the project came from its requirements in three aspects: rigorous archival description, the incorporation of document analysis and a flexible search interface. The system is an instance of a multimedia database framework providing both browse and retrieval functionalities to end users and configuration and content management services to the collection administrators.
2009
Authors
Calistru, C; Ribeiro, C; David, G;
Publication
EURASIP JOURNAL ON IMAGE AND VIDEO PROCESSING
Abstract
Cultural heritage documents are often subject to digitization processes resulting in image material, even for textual contents. It is therefore common, in collections of valuable documents, to have descriptive information generated by the institutions, along with digitized images, transcriptions created by scholars, translations and even miscellaneous annotations. To offer faceted access to the collection it is necessary to explore these diverse materials, integrate them according to a model that accounts for both the metadata and the content, and provide a comprehensive retrieval environment. In this work we have applied the MetaMedia multimedia database framework to a collection of ancient documents, processed the documents in their descriptive, textual, and image content and produced a browsing and searching system. The main challenges are the integrated management of metadata and content, the indexing of the image content, and the design of the browsing and searching interface where various views on the data are kept together. Copyright (C) 2009 Catalin Calistru et al.
2010
Authors
Nunes, S; Ribeiro, C; David, G;
Publication
DOCENG2010: PROCEEDINGS OF THE 2010 ACM SYMPOSIUM ON DOCUMENT ENGINEERING
Abstract
Documents on the World Wide Web are dynamic entities. Mainstream information retrieval systems and techniques are primarily focused on the latest version of a document, generally ignoring its evolution over time. In this work, we study the term frequency dynamics in web documents over their lifespan. We use Wikipedia as a document collection because it is a broad and public resource and, more importantly, because it provides access to the complete revision history of each document. We investigate the progression of similarity values over two projection variables, namely revision order and revision date. Based on this investigation we find that term frequency in encyclopedic documents (i.e. comprehensive and focused on a single topic) exhibits a rapid and steady progression towards the document's current version. The content in early versions quickly becomes very similar to the present version of the document.
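The progression of similarity over revision order can be illustrated with a small sketch that compares the term frequencies of each revision with those of the current version. Cosine similarity and whitespace tokenization are assumptions made for this example; the abstract does not name the similarity measure or the preprocessing used in the study, and the function names are hypothetical.

```python
import math
from collections import Counter


def term_frequencies(text):
    """Term frequency vector using naive lowercase whitespace tokenization."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two term frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty revision shares nothing with any other text.
    return dot / (norm_a * norm_b)


def similarity_to_current(revisions):
    """Similarity of every revision (oldest first) to the last one."""
    current = term_frequencies(revisions[-1])
    return [cosine_similarity(term_frequencies(r), current) for r in revisions]


# Toy revision history, oldest first (made-up text).
history = [
    "wiki",
    "wiki page history",
    "wiki page revision history",
]
similarities = similarity_to_current(history)
```

Plotting `similarities` against the revision index corresponds to the revision order projection; replacing the index with each revision's timestamp would give the revision date projection.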