Publications

Publications by HumanISE

2023

Research Image Management Practices Reported by Scientific Literature: An Analysis by Research Domain

Authors
Rodrigues J.; Lopes C.T.;

Publication
Open Information Science

Abstract
Research data management is essential for safeguarding and prospecting data generated in a scientific context. Specific issues arise regarding data in image format, as this data typology poses particular challenges and opportunities; however, not much attention has been given to data as images. We reviewed 109 articles from several research domains where images were used either as data or metadata to understand how researchers specifically deal with this data format, and what their habits and behaviors are. We used the Web of Science (WoS), considering its five main areas of research. We included in the initial corpus the most relevant articles by research domain, selecting the ten most cited articles in WoS, by year, between 2010 and 2021. The selected articles had to be in English and open access. The results show that images are used frequently in scientific work but are rarely the central element of the study. Photographs are the most used image type in most domains. In terms of the instruments used, the Technology and Life Sciences and Biomedicine domains use the microscope more, while the Arts and Humanities and Physical Sciences domains use the camera more. We found that the images are mostly produced in the context of the project, rather than reused from third parties. As for their collection scenario, these are mostly produced/used in a laboratory context. The overwhelming majority of the images present in the articles are digital, and only a small part is analog. We found that Arts and Humanities are more likely to perform qualitative types of analyses, while Life Sciences and Biomedicine overwhelmingly use quantitative analyses. As for the issues of sharing and depositing, Life Sciences and Biomedicine is the domain that stands out the most in the tasks of depositing and sharing images.
It was found that the licenses of a project are intrinsically related to the motivations for sharing results with third parties. Description, a fundamental step in the data management process, is neglected by a large number of researchers. The images are mostly not described or annotated, and when they are, researchers provide little detail.

2023

The Evolution of Web Search User Interfaces - An Archaeological Analysis of Google Search Engine Result Pages

Authors
Oliveira, B; Lopes, CT;

Publication
Proceedings of the 2023 Conference on Human Information Interaction and Retrieval, CHIIR 2023, Austin, TX, USA, March 19-23, 2023

Abstract
Web search engines have marked everyone's life by transforming how one searches and accesses information. Search engines give special attention to the user interface, especially search engine result pages (SERP). The well-known "10 blue links" list has evolved into richer interfaces, often personalized to the search query, the user, and other aspects. More than 20 years later, the literature has not adequately portrayed this development. We present a study on the evolution of SERP interfaces during the last two decades using Google Search as a case study. We used the most searched queries by year to extract a sample of SERP from the Internet Archive. Using this dataset, we analyzed how SERP evolved in content, layout, design (e.g., color scheme, text styling, graphics), navigation, and file size. We have also analyzed the user interface design patterns associated with SERP elements. We found that SERP are becoming more diverse in terms of elements, aggregating content from different verticals and including more features that provide direct answers. This systematic analysis portrays evolution trends in search engine user interfaces and, more generally, web design. We expect this work will trigger other, more specific studies that can take advantage of our dataset.

2023

From 10 Blue Links Pages to Feature-Full Search Engine Results Pages - Analysis of the Temporal Evolution of SERP Features

Authors
Oliveira, B; Lopes, CT;

Publication
Proceedings of the 2023 Conference on Human Information Interaction and Retrieval, CHIIR 2023, Austin, TX, USA, March 19-23, 2023

Abstract
Web Search Engine Results Pages (SERP) are among the best-known and most used web pages. These pages started as simple "10 blue links" pages, but the information in SERP currently goes well beyond these links. Several features have been included in these pages to complement organic and sponsored results and attempt to provide answers to the query instead of just pointing to websites that might deliver that information. In this work, we analyze the appearance and evolution of SERP features in the two leading web search engines, Google Search and Microsoft Bing. Using a sample of SERP from the Internet Archive, we analyzed the appearance and evolution of these features. We found that SERP are becoming more diverse in terms of elements, aggregating content from different verticals and including more features that provide direct answers.

2023

Images as Metadata: A New Perspective for Describing Research Data

Authors
Rodrigues, J; Teixeira Lopes, C;

Publication
Journal of Library Metadata

Abstract
Indispensable in many contexts, images are fundamental in the tasks of representation and transmission of information. In the scientific context, images can be tools for researchers seeking to see their data properly managed. Research data management points in this direction, as it determines the necessary phases in the life cycle of projects. The description phase is fundamental as it is an essential means for data context, safeguarding, and reuse. The description often occurs through metadata models composed of descriptors capable of attributing context. However, there is one common aspect: the values associated with these descriptors are always textual or numeric. Through studies and work developed over the last few years, we propose a new approach to description, where images can have a preponderant role in the description of data, assuming the role of metadata. We present several pieces of evidence, point out their challenges and determine the opportunities this new perspective can have in research. Images have specific characteristics that can be leveraged in improving data description. Historical evidence establishes that images have always been used and produced in research, yet their representational ability has never been harnessed to describe data and give more context to the scientific process. © Joana Rodrigues and Carla Teixeira Lopes. Published with license by Taylor & Francis Group, LLC.

2023

Optimization of Image Processing Algorithms for Character Recognition in Cultural Typewritten Documents

Authors
Dias, M; Lopes, CT;

Publication
ACM Journal on Computing and Cultural Heritage

Abstract
Linked data is used in various fields as a new way of structuring and connecting data. Cultural heritage institutions have been using linked data to improve archival descriptions and facilitate the discovery of information. Most archival records have digital representations of physical artifacts in the form of scanned images that are non-machine-readable. Optical Character Recognition (OCR) recognizes text in images and translates it into machine-encoded text. This article evaluates the impact of image processing methods and parameter tuning in OCR applied to typewritten cultural heritage documents. The approach uses a multi-objective problem formulation to minimize Levenshtein edit distance and maximize the number of words correctly identified with a non-dominated sorting genetic algorithm (NSGA-II) to tune the methods' parameters. Evaluation results show that parameterization by digital representation typology benefits the performance of image pre-processing algorithms in OCR. Furthermore, our findings suggest that employing image pre-processing algorithms in OCR might be more suitable for typologies where the text recognition task without pre-processing does not produce good results. In particular, Adaptive Thresholding, Bilateral Filter, and Opening are the best-performing algorithms for the theater plays' covers, letters, and overall dataset, respectively, and should be applied before OCR to improve its performance.
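The two objectives named in this abstract can be illustrated with a small Python sketch. This is not the authors' implementation: the function names, the whitespace tokenization, and the word-matching rule are assumptions made for illustration, and only the non-domination test that NSGA-II builds on is shown, not the full genetic algorithm.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def words_matched(ocr_text: str, truth: str) -> int:
    """Count ground-truth words also present in the OCR output.

    Assumption: words are whitespace-separated and compared as a multiset.
    """
    overlap = Counter(ocr_text.split()) & Counter(truth.split())
    return sum(overlap.values())


def objectives(ocr_text: str, truth: str) -> tuple[int, int]:
    """Return (edit distance to minimize, matched words to maximize)."""
    return levenshtein(ocr_text, truth), words_matched(ocr_text, truth)


def dominates(p: tuple[int, int], q: tuple[int, int]) -> bool:
    """True if p is no worse than q on both objectives and better on one."""
    no_worse = p[0] <= q[0] and p[1] >= q[1]
    better = p[0] < q[0] or p[1] > q[1]
    return no_worse and better


def pareto_front(scores: dict[str, tuple[int, int]]) -> list[str]:
    """Names of candidates that no other candidate dominates."""
    return sorted(name for name, s in scores.items()
                  if not any(dominates(o, s) for o in scores.values()))
```

Each candidate pre-processing configuration would be scored with `objectives` against a ground-truth transcription; `pareto_front` then keeps the configurations that are not beaten on both objectives at once, which corresponds to the first front in a non-dominated sort.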

2023

Unveiling Archive Users: Understanding Their Characteristics and Motivations

Authors
Ponte, L; Koch, I; Lopes, CT;

Publication
Leveraging Generative Intelligence in Digital Libraries: Towards Human-Machine Collaboration, ICADL 2023, Pt II

Abstract
An institution must understand its users to provide quality services, and archives are no exception. Over the years, archives have adapted to the technological world, and their users have also changed. To understand archive users' characteristics and motivations, we conducted a study in the context of the Portuguese Archives. For this purpose, we analysed a survey and complemented this analysis with information gathered in interviews with archivists. Based on the most frequent reasons for visiting the archives, we defined six main archival profiles (genealogical research, historical research, legal purposes, academic work, institutional purposes and publication purposes), later characterised using the results of the previous analysis. For each profile, we created a persona for a more visual and realistic representation of users.
