Publications

Publications by CTM

2025

Evaluation of Lyrics Extraction from Folk Music Sheets Using Vision Language Models (VLMs)

Authors
Mendes, AS; Murciego, AL; Silva, LA; Jiménez-Bravo, DM; Navarro-Cáceres, M; Bernardes, G;

Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2024, PT I

Abstract
Monodic folk music has traditionally been preserved in physical documents. It constitutes a vast archive that needs to be digitized to facilitate comprehensive analysis using AI techniques. A critical component of music score digitization is the transcription of lyrics, an extensively researched process in Optical Character Recognition (OCR) and document layout analysis. These fields typically require the development of specific models that operate in several stages: first, to detect the bounding boxes of specific texts, then to identify the language, and finally, to recognize the characters. Recent advances in vision language models (VLMs) have introduced multimodal capabilities, such as processing images and text, which are competitive with traditional OCR methods. This paper proposes an end-to-end system for extracting lyrics from images of handwritten musical scores. We aim to evaluate the performance of two state-of-the-art VLMs to determine whether they can eliminate the need to develop specialized text recognition and OCR models for this task. The results of the study, obtained from a dataset in a real-world application environment, are presented along with promising new research directions in the field. This progress contributes to preserving cultural heritage and opens up new possibilities for global analysis and research in folk music.

2025

Exploring the Role of Sound Design in Serious Games: Impact on User Experience and Learning Outcomes

Authors
Cao, Z; Pinto, AS; Bernardes, G;

Publication
International Conference on Computer Supported Education, CSEDU - Proceedings

Abstract
Sound design plays an important role in serious games, influencing user experience and learning outcomes. However, deriving general principles and best practices remains challenging, as most literature relies on case-based studies in different application domains. Through a systematic review of the literature, 21 studies were analyzed to address two key questions: 1) what types of serious games and application domains incorporate sound design? and 2) what sound design strategies are implemented to enhance user experience and learning outcomes? The findings show that serious games have mainly focused on education, healthcare, and training, using sound to enhance motivation (50%), cognition (32%), and knowledge acquisition (18%). Furthermore, sound design strategies fulfill distinct roles: sound effects enhance feedback and engagement, background music influences motivation and cognitive processing, ambient sounds support navigation and emotional regulation, and dialogue facilitates knowledge acquisition. The findings highlight the need for further research to establish standardized sound design principles to optimize user experience and learning outcomes in serious games. Copyright © 2025 by SCITEPRESS - Science and Technology Publications, Lda.

2025

Sound Design for Electric Vehicles: Enhancing Safety and User Experience Through Acoustic Vehicle Alerting System (AVAS)

Authors
Rodrigues Ferraz Esteves, AR; Campos Magalhães, EM; Bernardes De Almeida, G;

Publication
SAE Technical Papers

Abstract
Silent motors are an excellent strategy to combat noise pollution. Still, they can pose risks for pedestrians who rely on auditory cues for safety and reduce driver awareness due to the absence of the familiar sounds of combustion engines. Sound design for silent motors not only tackles the above issues but goes beyond safety standards towards a user-centered approach by considering how users perceive and interpret sounds. This paper examines the evolving field of sound design for electric vehicles (EVs), focusing on Acoustic Vehicle Alerting Systems (AVAS). The study analyzes existing AVAS, classifying them into different groups according to their design characteristics, from technical concerns and approaches to aesthetic properties. Building on this classification, we propose an (adaptive) sound design methodology and concept for AVAS, based on state-of-the-art technologies and tools (APIs) such as Wwise Automotive and integrated through a functional prototype within a virtual environment. We validate our solution by conducting user tests focusing on EV sound perception and preferences in rural and urban environments. Results showed that, in rural areas, participants preferred nature-like and melodic sounds with a wide range of frequencies, emphasizing 1000 Hz, for the AVAS. For the interior experience, they preferred melodic, reliable, and relaxing sounds with a frequency range from 200 Hz to 500 Hz. In urban areas, melodic, futuristic, but not overpowering sounds (80 Hz to 700 Hz) with balanced frequencies at high speeds were chosen for the car's exterior. In the interior, melodic, futuristic, and combustion engine-like sounds with a low-frequency background and higher frequencies at high speeds were also preferred. © 2025 SAE International. All Rights Reserved.

2025

Algorithmic Composition Using Narrative Structure and Tension

Authors
Braga, F; Bernardes, G; Dannenberg, RB; Correia, N;

Publication
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence

Abstract
This paper describes an approach to algorithmic music composition that takes narrative structures as input, allowing composers to create music directly from narrative elements. Creating narrative development in music remains a challenging task in algorithmic composition. Our system addresses this by combining leitmotifs to represent characters, generative grammars for harmonic coherence, and evolutionary algorithms to align musical tension with narrative progression. The system operates at different scales, from overall plot structure to individual motifs, enabling both autonomous composition and co-creation with varying degrees of user control. Evaluation with compositions based on tales demonstrated the system's ability to compose music that supports narrative listening and aligns with its source narratives, while being perceived as familiar and enjoyable.

2025

Leveraging Large-language Models for Thematic Analysis of Children’s Folk Lyrics: A comparative study of Iberian Traditions

Authors
Forero Rodriguez, J; Bernardes, G;

Publication
Proceedings of the 12th International Conference on Digital Libraries for Musicology

Abstract

2025

Performance Configuration Analysis in Portuguese Traditional Music: A Computational Approach

Authors
Khatri, N; Bernardes, G;

Publication
Proceedings of the 12th International Conference on Digital Libraries for Musicology

Abstract
