Publications

Publications by Carla Lopes

2018

Predicting the quality of health web documents using their characteristics

Authors
Oroszlanyova, M; Lopes, CT; Nunes, S; Ribeiro, C;

Publication
ONLINE INFORMATION REVIEW

Abstract
Purpose: The quality of consumer-oriented health information on the web has been defined and evaluated in several studies. Usually it is based on evaluation criteria identified by the researchers and, so far, there is no agreed standard for the quality indicators to use. Based on such indicators, tools have been developed to evaluate the quality of web information. The HONcode is one such tool. The purpose of this paper is to investigate the influence of web document features on their quality, using HONcode as ground truth, with the aim of finding whether it is possible to predict the quality of a document using its characteristics. Design/methodology/approach: The present work uses a set of health documents and analyzes how their characteristics (e.g. web domain, last update, type, mention of places of treatment and prevention strategies) are associated with their quality. Based on these features, statistical models are built which predict whether health-related web documents have certification-level quality. Multivariate analysis is performed, using classification to estimate the probability of a document having quality given its characteristics. This approach tells us which predictors are important. Three types of full and reduced logistic regression models are built and evaluated. The first one includes every feature, without any exclusion; the second one disregards the Utilization Review Accreditation Commission variable, because it is a quality indicator; and the third one excludes the variables related to the HONcode principles, which might also be indicators of quality. The reduced models were built to see whether they reach similar results with a smaller number of features. Findings: The prediction models have high accuracy, even without including the characteristics of Health on the Net code principles in the models. The most informative prediction model considers characteristics that can be assessed automatically (e.g. split content, type, process of revision and place of treatment). It has an accuracy of 89 percent. Originality/value: This paper proposes models that automatically predict whether a document has quality or not. Some of the features used (e.g. prevention, prognosis or treatment) have not yet been explicitly considered in this context. The findings of the present study may be used by search engines to promote high-quality documents. This will improve health information retrieval and may help reduce the problems caused by inaccurate information.

2019

A classification scheme for analyses of messages exchanged in online health forums

Authors
Lopes, CT; Da Silva, BG;

Publication
INFORMATION RESEARCH-AN INTERNATIONAL ELECTRONIC JOURNAL

Abstract
Introduction. Online health forums help to surface and organize patients' knowledge and make it useful for many. They are used by many to seek advice or to share what they know about health subjects. Because they are an important communication medium, it is important to understand why and how they are used. Method. In this work we examine and categorize messages of an online health forum, with the purpose of providing a classification scheme that can be used by the research community in future analyses. The definition of the classification scheme was iterative and its inter-rater reliability was assessed twice using Cohen's Kappa statistic. Analysis. The classification scheme arose from a content analysis of 3,399 messages from several communities of an online health forum. Findings. The scheme is divided into four sections of categories and each section has several subcategories; in total there are 23 subcategories. The inter-rater agreement assessment of the scheme showed good consistency between coders. The majority of the categories have a Cohen's Kappa agreement above 0.4. Conclusion. The proposed classification scheme facilitates the analysis of messages exchanged in online health forums for several purposes, including studies of information seeking.
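As a reader's aid, the Cohen's Kappa statistic mentioned in the abstract above can be sketched as follows. This is a minimal illustration of the standard two-rater formula, kappa = (p_o - p_e) / (1 - p_e), and not the authors' code; the function name `cohens_kappa` and the example category labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters who labelled the same items.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both raters must label the same non-empty set of items")

    n = len(labels_a)
    # Observed agreement: share of items on which both raters chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of each rater's marginal label frequencies,
    # summed over the labels.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1 - expected)


# Made-up labels for four forum messages, coded by two raters.
rater_1 = ["question", "question", "answer", "answer"]
rater_2 = ["question", "answer", "answer", "answer"]
kappa = cohens_kappa(rater_1, rater_2)
```

In this made-up example the raters agree on three of four messages (p_o = 0.75) and chance agreement is p_e = 0.5, so kappa is 0.5.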

2019

Interplay of Documents' Readability, Comprehension and Consumer Health Search Performance Across Query Terminology

Authors
Lopes, CT; Ribeiro, C;

Publication
PROCEEDINGS OF THE 2019 CONFERENCE ON HUMAN INFORMATION INTERACTION AND RETRIEVAL (CHIIR'19)

Abstract
Because of terminology mismatches, health consumers frequently face difficulties while searching the Web for health information. Difficulties arise in query formulation but also in understanding the retrieved documents. In this work we analyze how documents' readability affects users' comprehension and how both affect the retrieval performance, measured in different ways. In addition, we analyze how performance measures relate to each other. For this purpose we conducted a laboratory user study with 40 participants. We found that readability is essential for a document to be at least partially relevant and that it becomes even more important if the document has medico-scientific terminology. Moreover, the relevance of a document to a specific user highly depends on its comprehension. In lay queries we found that the medical accuracy of users' answers is related to the session's relevance assessments. This shows that users can, at least in part, relate their relevance assessments to the medical accuracy of the documents. On the other hand, this relationship does not exist with medico-scientific queries.

2019

Assisting Health Consumers While Searching the Web Through Medical Annotations

Authors
Lopes, CT; Sousa, H;

Publication
PROCEEDINGS OF THE 2019 CONFERENCE ON HUMAN INFORMATION INTERACTION AND RETRIEVAL (CHIIR'19)

Abstract
Health consumers usually face difficulties in their online searches, mainly because of the differences between the terminologies used by laypeople and health professionals. This work presents a tool, HealthTranslator, available as a Google Chrome extension, that intends to reduce this terminological gap while users are searching the Web for health information. HealthTranslator automatically annotates medical concepts in web documents, providing additional information, such as concept definition, related concepts and links to external references. The solution was evaluated regarding its: (a) performance: the document processing is done gradually, typically from the top to the bottom of the document, and performance was not an issue raised by the users; (b) concept coverage: the solution was compared to a similar extension performing in English, recognizing significantly more concepts. A comparison with a corpus of Portuguese documents manually annotated with medical concepts showed an average F-measure between 27% and 33%, depending on the type of concepts being recognized; (c) users' receptivity to HealthTranslator and its usability: many aspects were surveyed in a user study. In general, the extension has good acceptance and users find it useful.
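For readers unfamiliar with the F-measure reported in the abstract above, the following is a minimal sketch of the standard definition (the weighted harmonic mean of precision and recall), computed over sets of recognized concepts. It is not the evaluation code used in the paper; the function name `f_measure` and the example concepts are hypothetical.

```python
def f_measure(predicted, gold, beta=1.0):
    """F-measure of predicted concepts against a manually annotated gold set.

    Hypothetical helper for illustration; beta=1.0 gives the balanced F1 score.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        # No correct concept recognized: precision and recall are both zero.
        return 0.0

    precision = true_positives / len(predicted)  # correct share of what was annotated
    recall = true_positives / len(gold)          # share of gold concepts that were found
    beta_sq = beta * beta
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Made-up example: concepts annotated automatically vs. by a human annotator.
automatic = ["diabetes", "insulina", "febre"]
manual = ["diabetes", "insulina", "hipertensão", "asma"]
score = f_measure(automatic, manual)
```

Here precision is 2/3 and recall is 1/2, so the balanced F-measure is 4/7, roughly 0.57.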

2019

Characterizing and comparing Portuguese and English Wikipedia medicine-related articles

Authors
Domingues, G; Lopes, CT;

Publication
COMPANION OF THE WORLD WIDE WEB CONFERENCE (WWW 2019)

Abstract
Wikipedia is the largest online collaborative encyclopedia, containing information from a plethora of fields, including medicine. It has been shown that Wikipedia is one of the top visited sites by readers looking for information on this topic. The large reliance on Wikipedia for this type of information drives research towards the analysis of the quality of its articles. In this work, we evaluate and compare the quality of medicine-related articles in the English and Portuguese Wikipedia. For that we use metrics such as authority, completeness, complexity, informativeness, consistency, currency and volatility, as well as domain-specific measurements. We were able to conclude that the English articles score better across most metrics than the Portuguese articles.

2019

Graph-of-Entity: A Model for Combined Data Representation and Retrieval

Authors
Devezas, JL; Lopes, CT; Nunes, S;

Publication
8th Symposium on Languages, Applications and Technologies, SLATE 2019, June 27-28, 2019, Coimbra, Portugal.

Abstract
Managing large volumes of digital documents along with the information they contain, or are associated with, can be challenging. As systems become more intelligent, it increasingly makes sense to power retrieval through all available data, where every lead makes it easier to reach relevant documents or entities. Modern search is heavily powered by structured knowledge, but users still query using keywords or, at the very best, telegraphic natural language. As search becomes increasingly dependent on the integration of text and knowledge, novel approaches for a unified representation of combined data present the opportunity to unlock new ranking strategies. We tackle entity-oriented search using graph-based approaches for representation and retrieval. In particular, we propose the graph-of-entity, a novel approach for indexing combined data, where terms, entities and their relations are jointly represented. We compare the graph-of-entity with the graph-of-word, a text-only model, verifying that, overall, it does not yet achieve better performance, despite obtaining a higher precision. Our assessment was based on a small subset of the INEX 2009 Wikipedia Collection, created from a sample of 10 topics and their respective judged documents. The offline evaluation we do here is complementary to its counterpart from the TREC 2017 OpenSearch track, where, during our participation, we assessed graph-of-entity in an online setting, through team-draft interleaving. © José Devezas, Carla Lopes, and Sérgio Nunes.
