Publications

Publications by Cristina Ribeiro

2007

Multimedia in cultural heritage collections: A model and applications

Authors
Ribeiro, C; David, G; Calistru, C;

Publication
ASIAN DIGITAL LIBRARIES: LOOKING BACK 10 YEARS AND FORGING NEW FRONTIERS, PROCEEDINGS

Abstract
The paper presents a multimedia database model accounting for the representation of documents, collections and the associated metadata. Appropriate structures are provided for descriptive metadata and for metadata resulting from automatic content analysis. The model is based on the identification and unification of the main concepts in the archival standards and the audiovisual area. The main features of the model, designed to support multimedia database applications, are the integration of descriptive and content analysis metadata, the association of metadata to collections as well as to items, the extensibility with respect to the inclusion of new descriptors and the support for several retrieval modes. The MetaMedia application development platform, based on the model, has been used to support the construction of a historic documentation collection where a common web interface provides collection administrators, metadata creators and visitors a multi-faceted view of the repository.

2007

A historic documentation repository for specialized and public access

Authors
Ribeiro, C; David, G; Calistru, C;

Publication
Research and Advanced Technology for Digital Libraries, Proceedings

Abstract
The web is currently the information searching and browsing environment of choice for scholars and lay users alike. The goal of most cultural heritage applications is to interest a large audience, and therefore web interfaces are being developed even when part of their functionality is not offered to the general public. We present a web-based interface for managing, browsing and searching a repository of historic documents. The documents pertain to a region which has been an important regional power in medieval times and their originals are under the custody of the Portuguese national archives. The challenges of the project came from its requisites in three aspects: rigorous archival description, the incorporation of document analysis and a flexible search interface. The system is an instance of a multimedia database framework providing both browse and retrieval functionalities to end users and configuration and content management services to the collection administrators.

2009

Multimedia in Cultural Heritage Manuscripts: Integrating Description, Transcription, and Image Content

Authors
Calistru, C; Ribeiro, C; David, G;

Publication
EURASIP JOURNAL ON IMAGE AND VIDEO PROCESSING

Abstract
Cultural heritage documents are often subject to digitization processes resulting in image material, even for textual contents. It is therefore common, in collections of valuable documents, to have descriptive information generated by the institutions, along with digitized images, transcriptions created by scholars, translations and even miscellaneous annotations. To offer a faceted access to the collection it is necessary to explore these diverse materials, integrate them according to a model that accounts for both metadata and the content and provide a comprehensive retrieval environment. In this work we have applied the MetaMedia multimedia database framework to a collection of ancient documents, processed the documents in their descriptive, textual, and image content and produced a browsing and searching system. The main challenges are the integrated management of metadata and content, the indexing of the image content, and the design of the browsing and searching interface where various views on the data are kept together. Copyright (C) 2009 Catalin Calistru et al.

2010

Term Frequency Dynamics in Collaborative Articles

Authors
Nunes, S; Ribeiro, C; David, G;

Publication
DOCENG2010: PROCEEDINGS OF THE 2010 ACM SYMPOSIUM ON DOCUMENT ENGINEERING

Abstract
Documents on the World Wide Web are dynamic entities. Mainstream information retrieval systems and techniques are primarily focused on the latest version of a document, generally ignoring its evolution over time. In this work, we study the term frequency dynamics in web documents over their lifespan. We use Wikipedia as a document collection because it is a broad and public resource and, more importantly, because it provides access to the complete revision history of each document. We investigate the progression of similarity values over two projection variables, namely revision order and revision date. Based on this investigation we find that term frequency in encyclopedic documents - i.e. comprehensive and focused on a single topic - exhibits a rapid and steady progression towards the document's current version. The content in early versions quickly becomes very similar to the present version of the document.
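
The idea of tracking how similar each revision is to the current version can be illustrated with a minimal sketch. The abstract does not say which similarity measure or tokenization the authors used, so the cosine similarity over raw term counts below is an assumption, and the function names `term_frequencies`, `cosine_similarity` and `similarity_to_current` are hypothetical.

```python
import math
import re
from collections import Counter


def term_frequencies(text):
    """Count lowercase alphanumeric tokens (a simplistic tokenizer, assumed here)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors stored as Counters."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_to_current(revisions):
    """Similarity of every revision (oldest first) to the last, current one."""
    current = term_frequencies(revisions[-1])
    return [cosine_similarity(term_frequencies(r), current) for r in revisions]
```

Calling `similarity_to_current` on a list of revision texts ordered by revision number gives the progression over revision order; pairing each value with the revision timestamp instead would give the progression over revision date.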

2007

Using neighbors to date web documents

Authors
Nunes, S; Ribeiro, C; David, G;

Publication
International Conference on Information and Knowledge Management, Proceedings

Abstract
Time has been successfully used as a feature in web information retrieval tasks. In this context, estimating a document's inception date or last update date is a necessary task. Classic approaches have used HTTP header fields to estimate a document's last update time. The main problem with this approach is that it is applicable to a small part of web documents. In this work, we evaluate an alternative strategy based on a document's neighborhood. Using a random sample containing 10,000 URLs from the Yahoo! Directory, we study each document's links and media assets to determine its age. If we only consider isolated documents, we are able to date 52% of them. Including the document's neighborhood, we are able to estimate the date of more than 86% of the same sample. Also, we find that estimates differ significantly according to the type of neighbors used. The most reliable estimates are based on the document's media assets, while the worst estimates are based on incoming links. These results are experimentally evaluated with a real world application using different datasets. Copyright 2007 ACM.
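
As a rough sketch of the neighborhood idea, the code below falls back to the dates of a document's neighbors when its own HTTP `Last-Modified` header is missing or unparseable. The abstract does not describe how neighbor dates are combined, so taking the median is an assumption, and `parse_last_modified` and `estimate_date` are hypothetical names. Fetching the headers is left out; the functions only take header values that have already been collected.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from statistics import median


def parse_last_modified(value):
    """Parse an HTTP Last-Modified header value; return None if unusable."""
    if not value:
        return None
    try:
        parsed = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if parsed.tzinfo is None:
        # Treat dates without an explicit offset as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def estimate_date(own_header, neighbor_headers):
    """Use the document's own date if available, else the median neighbor date.

    The median is an assumed aggregation, not the method from the paper.
    """
    own = parse_last_modified(own_header)
    if own is not None:
        return own
    dates = [d for d in map(parse_last_modified, neighbor_headers) if d is not None]
    if not dates:
        return None
    middle = median(d.timestamp() for d in dates)
    return datetime.fromtimestamp(middle, tz=timezone.utc)
```

Since the abstract reports that estimates differ by neighbor type, a fuller version would keep separate lists for media assets, outgoing links and incoming links rather than pooling them.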

2007

An evaluation framework for multidimensional multimedia Descriptor indexing

Authors
Gonçalves, B; Calistru, C; Ribeiro, C; David, G;

Publication
2007 IEEE 23RD INTERNATIONAL CONFERENCE ON DATA ENGINEERING WORKSHOP, VOLS 1-2

Abstract
Automatic multimedia retrieval requires the use of complex features, which are typically captured by multidimensional descriptors. A basic operation in a multimedia retrieval system is similarity computation, making use of descriptor-dependent metrics. Many data structures have been proposed for managing the representation of multidimensional descriptors, each geared towards efficiency in some set of basic operations. The paper describes a framework for evaluating multidimensional descriptor indexing structures and reports a set of experiments with selected descriptor indexing methods. The extensibility of the framework is illustrated by incorporating a recently proposed structure, the BitMatrix. Data sets and experiment conditions can be set up so as to provide results that can be used in the choice of appropriate indexing structures for a class of multimedia retrieval applications.
