Publications

Publications by HumanISE

2023

Optimization of Image Processing Algorithms for Character Recognition in Cultural Typewritten Documents

Authors
Dias, M; Lopes, CT;

Publication
ACM JOURNAL ON COMPUTING AND CULTURAL HERITAGE

Abstract
Linked data is used in various fields as a new way of structuring and connecting data. Cultural heritage institutions have been using linked data to improve archival descriptions and facilitate the discovery of information. Most archival records have digital representations of physical artifacts in the form of scanned images that are non-machine-readable. Optical Character Recognition (OCR) recognizes text in images and translates it into machine-encoded text. This article evaluates the impact of image processing methods and parameter tuning in OCR applied to typewritten cultural heritage documents. The approach uses a multi-objective problem formulation to minimize Levenshtein edit distance and maximize the number of words correctly identified with a non-dominated sorting genetic algorithm (NSGA-II) to tune the methods' parameters. Evaluation results show that parameterization by digital representation typology benefits the performance of image pre-processing algorithms in OCR. Furthermore, our findings suggest that employing image pre-processing algorithms in OCR might be more suitable for typologies where the text recognition task without pre-processing does not produce good results. In particular, Adaptive Thresholding, Bilateral Filter, and Opening are the best-performing algorithms for the theater plays' covers, letters, and overall dataset, respectively, and should be applied before OCR to improve its performance.
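As a rough illustration of the two objectives named in the abstract, the sketch below computes a Levenshtein edit distance and a count of correctly identified words for one OCR output against a reference transcription. This is not the authors' implementation: the function names, the whitespace tokenisation, and the multiset word matching are assumptions made for this example, and the NSGA-II search that would call these objectives is not shown.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def correct_word_count(ocr_text: str, reference: str) -> int:
    """Number of reference words also found in the OCR output.

    Assumption: words are split on whitespace and matched as a multiset,
    ignoring position.
    """
    ocr_words = Counter(ocr_text.split())
    ref_words = Counter(reference.split())
    return sum((ocr_words & ref_words).values())


def objectives(ocr_text: str, reference: str) -> tuple[int, int]:
    """Both objectives in minimisation form: (edit distance, -correct words)."""
    return levenshtein(ocr_text, reference), -correct_word_count(ocr_text, reference)
```

A multi-objective optimiser would evaluate `objectives` once per candidate parameter setting of a pre-processing method, after running OCR on the processed image.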

2023

Unveiling Archive Users: Understanding Their Characteristics and Motivations

Authors
Ponte, L; Koch, I; Lopes, CT;

Publication
LEVERAGING GENERATIVE INTELLIGENCE IN DIGITAL LIBRARIES: TOWARDS HUMAN-MACHINE COLLABORATION, ICADL 2023, PT II

Abstract
An institution must understand its users to provide quality services, and archives are no exception. Over the years, archives have adapted to the technological world, and their users have also changed. To understand archive users' characteristics and motivations, we conducted a study in the context of the Portuguese Archives. For this purpose, we analysed a survey and complemented this analysis with information gathered in interviews with archivists. Based on the most frequent reasons for visiting the archives, we defined six main archival profiles (genealogical research, historical research, legal purposes, academic work, institutional purposes and publication purposes), later characterised using the results of the previous analysis. For each profile, we created a persona for a more visual and realistic representation of users.

2023

Linking Theory and Practice of Digital Libraries: 27th International Conference on Theory and Practice of Digital Libraries, TPDL 2023, Zadar, Croatia, September 26-29, 2023, Proceedings

Authors
Alonso, O; Cousijn, H; Silvello, G; Marrero, M; Lopes, CT; Marchesin, S;

Publication
TPDL

Abstract

2023

Linking Theory and Practice of Digital Libraries

Authors
Alonso, O; Cousijn, H; Silvello, G; Marrero, M; Teixeira Lopes, C; Marchesin, S;

Publication
Lecture Notes in Computer Science

Abstract

2023

Chatbots Scenarios for Education

Authors
Virkus, S; Mamede, HS; Ramos Rocio, VJ; Dickel, J; Zubikova, O; Butkiene, R; Vaiciukynas, E; Ceponiene, L; Gudoniene, D;

Publication
Information and Software Technologies - 29th International Conference, ICIST 2023, Kaunas, Lithuania, October 12-14, 2023, Proceedings

Abstract
Educational chatbots are digital tools designed to assist learners in various educational settings. These chatbots use natural language processing (NLP) and machine learning algorithms to simulate human conversation and respond to user queries in a way that facilitates learning. They can be integrated into various educational platforms such as learning management systems, educational apps, and websites to provide learners with a personalized and interactive learning experience. Our paper discusses different scenarios for educational purposes and proposes a total of four scenarios for educational needs.

2023

New resource-constrained project scheduling instances for testing (meta-)heuristic scheduling algorithms

Authors
Coelho, J; Vanhoucke, M;

Publication
COMPUTERS & OPERATIONS RESEARCH

Abstract
The resource-constrained project scheduling problem (RCPSP) is a well-known scheduling problem that has attracted attention for several decades. Despite the rapid progress of exact and (meta-)heuristic procedures, the problem still cannot be solved to optimality for many problem instances of relatively small size. Due to the known complexity, many researchers have proposed fast and efficient meta-heuristic solution procedures that can solve the problem to near optimality. Despite the excellent results obtained in the last decades, little is known about why some heuristics perform better than others. However, if researchers better understood why some meta-heuristic procedures generate good solutions for some project instances while still falling short for others, this could lead to insights to improve these meta-heuristics, ultimately leading to stronger algorithms and better overall solution quality. In this study, a new hardness indicator is proposed to measure the difficulty of providing near-optimal solutions for meta-heuristic procedures. The new indicator is based on a new concept that uses the o-distance metric to describe the solution space of the problem instance, and relies on current knowledge for lower and upper bound calculations for problem instances from five known datasets in the literature. This new indicator, which will be called the o-D indicator, will be used not only to measure the hardness of existing project datasets, but also to generate a new benchmark dataset that can be used for future research purposes. The new dataset contains project instances with different values for the o-D indicator, and it will be shown that the value of the o-distance metric actually describes the difficulty of the project instances through two fast and efficient meta-heuristic procedures from the literature.
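For readers unfamiliar with the RCPSP, the sketch below shows how a priority list of activities can be decoded into a feasible schedule with a serial schedule generation scheme, a standard building block of many meta-heuristics for this problem. It is not taken from the paper and does not compute the proposed hardness indicator; the function name, the single renewable resource, the integer durations, and the toy instance in the usage note are all assumptions made for illustration.

```python
def serial_sgs(durations, predecessors, demands, capacity, priority_list):
    """Decode a precedence-feasible activity list into start times.

    Assumptions: one renewable resource, integer durations, and a
    priority list in which every activity appears after its predecessors.
    Returns (start_times, makespan).
    """
    if any(demands[a] > capacity for a in priority_list):
        raise ValueError("an activity demands more than the resource capacity")

    horizon = sum(durations.values())   # upper bound on any makespan
    usage = [0] * (horizon + 1)         # resource units in use per period
    start = {}
    for act in priority_list:
        # Earliest start allowed by precedence relations.
        t = max((start[p] + durations[p] for p in predecessors[act]), default=0)
        # Shift right until the resource is free for the whole duration.
        while any(usage[u] + demands[act] > capacity
                  for u in range(t, t + durations[act])):
            t += 1
        for u in range(t, t + durations[act]):
            usage[u] += demands[act]
        start[act] = t
    makespan = max(start[a] + durations[a] for a in start)
    return start, makespan
```

On a hypothetical three-activity instance with `durations = {"A": 3, "B": 2, "C": 2}`, `predecessors = {"A": [], "B": [], "C": ["A"]}`, `demands = {"A": 2, "B": 2, "C": 1}` and `capacity = 3`, the list `["A", "B", "C"]` places A at time 0 and both B and C at time 3, for a makespan of 5.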
