2025
Authors
Mendes, AS; Murciego, AL; Silva, LA; Jiménez-Bravo, DM; Navarro-Cáceres, M; Bernardes, G;
Publication
PROGRESS IN ARTIFICIAL INTELLIGENCE, EPIA 2024, PT I
Abstract
Monodic folk music has traditionally been preserved in physical documents. It constitutes a vast archive that needs to be digitized to facilitate comprehensive analysis using AI techniques. A critical component of music score digitization is the transcription of lyrics, an extensively researched process in Optical Character Recognition (OCR) and document layout analysis. These fields typically require the development of specific models that operate in several stages: first, to detect the bounding boxes of specific texts, then to identify the language, and finally, to recognize the characters. Recent advances in vision language models (VLMs) have introduced multimodal capabilities, such as processing images and text, which are competitive with traditional OCR methods. This paper proposes an end-to-end system for extracting lyrics from images of handwritten musical scores. We aim to evaluate the performance of two state-of-the-art VLMs to determine whether they can eliminate the need to develop specialized text recognition and OCR models for this task. The results of the study, obtained from a dataset in a real-world application environment, are presented along with promising new research directions in the field. This progress contributes to preserving cultural heritage and opens up new possibilities for global analysis and research in folk music.
2025
Authors
Cao, Z; Pinto, AS; Bernardes, G;
Publication
International Conference on Computer Supported Education, CSEDU - Proceedings
Abstract
Sound design plays an important role in serious games, influencing user experience and learning outcomes. However, deriving general principles and best practices remains challenging, as most literature relies on case-based studies in different application domains. Through a systematic review of the literature, 21 studies were analyzed to address two key questions: 1) what types of serious games and application domains incorporate sound design? and 2) what sound design strategies are implemented to enhance user experience and learning outcomes? The findings show that serious games have mainly focused on education, healthcare, and training, using sound to enhance motivation (50%), cognition (32%), and knowledge acquisition (18%). Furthermore, sound design strategies fulfill distinct roles: sound effects enhance feedback and engagement, background music influences motivation and cognitive processing, ambient sounds support navigation and emotional regulation, and dialogue facilitates knowledge acquisition. The findings highlight the need for further research to establish standardized sound design principles to optimize user experience and learning outcomes in serious games. Copyright © 2025 by SCITEPRESS - Science and Technology Publications, Lda.
2025
Authors
Rodrigues Ferraz Esteves, AR; Campos Magalhães, EM; Bernardes De Almeida, G;
Publication
SAE Technical Papers
Abstract
Silent motors are an excellent strategy to combat noise pollution. Still, they can pose risks for pedestrians who rely on auditory cues for safety, and they can reduce driver awareness due to the absence of the familiar sounds of combustion engines. Sound design for silent motors not only tackles these issues but goes beyond safety standards towards a user-centered approach by considering how users perceive and interpret sounds. This paper examines the evolving field of sound design for electric vehicles (EVs), focusing on Acoustic Vehicle Alerting Systems (AVAS). The study analyzes existing AVAS, classifying them into groups according to their design characteristics, from technical concerns and approaches to aesthetic properties. Building on this classification, we propose an (adaptive) sound design methodology and concept for AVAS that draw on state-of-the-art technologies and tools (APIs), such as Wwise Automotive, and are integrated in a functional prototype within a virtual environment. We validate our solution by conducting user tests focusing on EV sound perception and preferences in rural and urban environments. Results showed that, in rural areas, participants preferred nature-like and melodic sounds with a wide range of frequencies, emphasizing 1000 Hz, for the AVAS, and melodic, reliable, and relaxing sounds with a frequency range from 200 Hz to 500 Hz for the interior experience. In urban areas, melodic, futuristic, but not overpowering sounds (80 Hz to 700 Hz) with balanced frequencies at high speeds were chosen for the car's exterior. For the interior, melodic, futuristic, and combustion engine-like sounds with a low-frequency background and higher frequencies at high speeds were preferred. © 2025 SAE International. All Rights Reserved.
2025
Authors
Braga, F; Bernardes, G; Dannenberg, RB; Correia, N;
Publication
Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence
Abstract
2025
Authors
Forero Rodriguez, J; Bernardes, G;
Publication
Proceedings of the 12th International Conference on Digital Libraries for Musicology
Abstract
2025
Authors
Khatri, N; Bernardes, G;
Publication
Proceedings of the 12th International Conference on Digital Libraries for Musicology
Abstract