Publications

Publications by HumanISE

2015

Effects of terminology on health queries: An analysis by user's health literacy and topic familiarity

Authors
Lopes, CT; Ribeiro, C;

Publication
Advances in Librarianship

Abstract
Prior studies have shown that terminology support can improve health information retrieval but have not taken into account the characteristics of the user performing the search. In this chapter, the impact of translating queries' terms between lay and medico-scientific terminology, in users with different levels of health literacy and topic familiarity, is evaluated. Findings demonstrate that medico-scientific queries demand more from the users and are mostly aimed at health professionals. In addition, these queries retrieve documents that are less readable and less well understood by users. Despite this, medico-scientific queries are associated with higher precision in the top-10 retrieved documents and tend, slightly, to generate knowledge with less incorrect content. The researchers concluded that search engines should provide query suggestions with medico-scientific terminology whenever the user is able to digest it, that is, for users above the lowest levels of health literacy and topic familiarity. On the other hand, retrieval systems should provide lay alternative queries for users with inadequate health literacy or for those unfamiliar with a topic. In fact, the quantity of incorrect content in the knowledge that emerges from a medico-scientific session tends to decrease with topic familiarity and health literacy. In terms of topic familiarity, the opposite happens with Graded Average Precision. Moreover, users most familiar with a topic tend to have higher motivational relevance with medico-scientific queries than with lay queries. This work is the first to consider user context features while studying the impact of a query processing technique in several aspects of the retrieval process, including the medical accuracy of the acquired knowledge. © 2015 by Emerald Group Publishing Limited.

2015

Motivators and Deterrents for Data Description and Publication: Preliminary Results

Authors
Ribeiro, C; da Silva, JR; Castro, JA; Amorim, RC; Fortuna, P;

Publication
ON THE MOVE TO MEANINGFUL INTERNET SYSTEMS: OTM 2015 WORKSHOPS

Abstract
In the recent trend of data-intensive science, data publication is essential and institutions have to promote it among researchers. For the past decade, institutional repositories have been widely established for publications, and the motivations for deposit are well established. The situation is quite different for data, as we argue on the basis of a 5-year experience with research data management at the University of Porto. We address research data management from a disciplined yet flexible point of view, focusing on domain-specific metadata models embedded in intuitive tools, to make it easier for researchers to publish their datasets. We use preliminary data from a recent experiment in data publishing to identify motivators and deterrents for data publishing.

2015

Ontologies for Research Data Description: A Design Process Applied to Vehicle Simulation

Authors
Castro, JA; Perrotta, D; Amorim, RC; da Silva, JR; Ribeiro, C;

Publication
METADATA AND SEMANTICS RESEARCH, MTSR 2015

Abstract
Data description is an essential part of research data management, and it is easy to argue for the importance of describing data early in the research workflow. Specific metadata schemas are often proposed to support description. Given the diversity of research domains, such schemas are often missing, and when available they may be too generic, too complex or hard to incorporate in a description platform. In this paper we present a method used to design metadata models for research data description as ontologies. Ontologies are gaining acceptance as knowledge representation structures, and we use them here in the scope of the Dendro platform. The ontology design process is illustrated with a case study from Vehicle Simulation. According to the design process, the resulting model was validated by a domain specialist.

2015

Summarization of changes in dynamic text collections using Latent Dirichlet Allocation model

Authors
Kar, M; Nunes, S; Ribeiro, C;

Publication
INFORMATION PROCESSING & MANAGEMENT

Abstract
In the area of Information Retrieval, the task of automatic text summarization usually assumes a static underlying collection of documents, disregarding the temporal dimension of each document. However, in real world settings, collections and individual documents rarely stay unchanged over time. The World Wide Web is a prime example of a collection where information changes both frequently and significantly over time, with documents being added, modified or just deleted at different times. In this context, previous work addressing the summarization of web documents has simply discarded the dynamic nature of the web, considering only the latest published version of each individual document. This paper proposes and addresses a new challenge - the automatic summarization of changes in dynamic text collections. In standard text summarization, retrieval techniques present a summary to the user by capturing the major points expressed in the most recent version of an entire document in a condensed form. In this new task, the goal is to obtain a summary that describes the most significant changes made to a document during a given period. In other words, the idea is to have a summary of the revisions made to a document over a specific period of time. This paper proposes different approaches to generate summaries using extractive summarization techniques. First, individual terms are scored and then this information is used to rank and select sentences to produce the final summary. A system based on Latent Dirichlet Allocation model (LDA) is used to find the hidden topic structures of changes. The purpose of using the LDA model is to identify separate topics where the changed terms from each topic are likely to carry at least one significant change. The different approaches are then compared with the previous work in this area. A collection of articles from Wikipedia, including their revision history, is used to evaluate the proposed system. 
For each article, a temporal interval and a reference summary from the article's content are selected manually. The articles and intervals in which a significant event occurred are carefully selected. The summaries produced by each of the approaches are evaluated against the manual summaries using ROUGE metrics. It is observed that the approach using the LDA model outperforms all the other approaches. Statistical tests reveal that the differences in ROUGE scores for the LDA-based approach are statistically significant at the 99% level over the baseline.

2015

Engaging Researchers in Data Management with LabTablet, an Electronic Laboratory Notebook

Authors
Amorim, RC; Castro, JA; da Silva, JR; Ribeiro, C;

Publication
LANGUAGES, APPLICATIONS AND TECHNOLOGIES, SLATE 2015

Abstract
Dealing with research data management can be a complex task, and recent guidelines prompt researchers to actively participate in this activity. Emergent research data platforms are proposing workflows to motivate researchers to take an active role in the management of their data. Other tools, such as electronic laboratory notebooks, can be embedded in the laboratory environment to ease the collection of valuable data and metadata as soon as it is available. This paper reports an extension of the previously developed LabTablet application to gather data and metadata for different research domains. Along with this extension, we present a case study from the social sciences, concerning the identification of the data description requirements for one of its domains. We argue that the LabTablet can be crucial to engage researchers in data organization and description. After starting the process, researchers can then manage their data in Dendro, a staging platform with stronger, collaborative management capabilities, which allows them to export their annotated datasets to selected research data repositories.

2015

The influence of documents, users and tasks on the relevance and comprehension of health web documents

Authors
Oroszlanyova, M; Ribeiro, C; Nunes, S; Lopes, CT;

Publication
CONFERENCE ON ENTERPRISE INFORMATION SYSTEMS/INTERNATIONAL CONFERENCE ON PROJECT MANAGEMENT/CONFERENCE ON HEALTH AND SOCIAL CARE INFORMATION SYSTEMS AND TECHNOLOGIES, CENTERIS/PROJMAN / HCIST 2015

Abstract
Search engines typically estimate relevance using features of the documents. We believe that several features from the user and task can also contribute to this process. In the health domain there are specific characteristics of web documents that can also add value to this estimation. In the present work, using a dataset composed of a set of annotated web pages and their assessment by a set of users regarding their relevance and comprehension, we analyse what characteristics affect documents' relevance and what characteristics influence how well users comprehend them. We have conducted a bivariate analysis using characteristics of the above data collection. The strongest relations we have found are linked to the task features, suggesting a direct association between tasks' clarity and ease and both the relevance and the comprehension of the content. The language of the document, its medical certification, its update status, and its content on pathology definitions, prevention, prognosis and treatment are other characteristics valued by consumers in terms of relevance. Users' previous experience with health searches and, particularly, with the topic being searched, their gender, and the language and terminology of their queries were shown to be related to their success in the search tasks. We have also found that lay terminology, knowledge about the medico-scientific terms and the language of the documents are good indicators of comprehension. Documents containing links and testimonies, and the ones recently updated, were observed to be better understood by users, as were blog posts and comments. (C) 2015 The Authors. Published by Elsevier B.V.
