2023
Authors
Forero, J; Bernardes, G; Mendes, M;
Publication
Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST
Abstract
Language is closely related to how we perceive ourselves and signify our reality. In this scope, we created Desiring Machines, an interactive media art project that allows the experience of affective virtual environments adopting speech emotion recognition as the leading input source. Participants can share their emotions by speaking, singing, reciting poetry, or making any vocal sounds to generate virtual environments on the fly. Our contribution combines two machine learning models. We propose a long short-term memory network and a convolutional neural network to predict four main emotional categories from high-level semantic and low-level paralinguistic acoustic features. Predicted emotions are mapped to audiovisual representations by an end-to-end process encoding emotion in virtual environments. We use a generative model of chord progressions to transfer speech emotion into music based on the tonal interval space. Also, we implement a generative adversarial network to synthesize an image from the transcribed speech-to-text. The generated visuals are used as the style image in the style-transfer process onto an equirectangular projection of a spherical panorama selected for each emotional category. The result is an immersive virtual space encapsulating emotions in spheres disposed into a 3D environment. Users can create new affective representations or interact with other previously encoded instances (This ArtsIT publication is an extended version of the earlier abstract presented at the ACM MM22 [1]). © 2023, ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering.
2023
Authors
Clemente, MP; Mendes, J; Bernardes, G; Van Twillert, H; Ferreira, AP; Amarante, JM;
Publication
JOURNAL OF INTERNATIONAL MEDICAL RESEARCH
Abstract
This paper presents a clinical case study investigating the pattern of a saxophonist's embouchure as a possible origin of orofacial pain. The rehabilitation addressed the dental occlusion and a fracture in a metal-ceramic bridge. To evaluate the undesirable loads on the upper teeth, two piezoresistive sensors were placed between the central incisors and the mouthpiece during the embouchure. A newly fixed metal-ceramic prosthesis was placed from teeth 13 to 25, and two implants were placed in the premolar zone corresponding to teeth 14 and 15. After the oral rehabilitation, the embouchure force measurements showed that higher stability was promoted by the newly fixed metal-ceramic prosthesis. The musician executed a more symmetric loading of the central incisors (teeth 11 and 21). The functional demands of the saxophone player and consequent application of excessive pressure can significantly influence and modify the metal-ceramic position on the anterior zone teeth 21/22. The contribution of engineering (i.e., monitoring the applied forces on the musician's dental structures) was therefore crucial for the correct assessment and design of the treatment plan.
2023
Authors
Bernardes, G; Carvalho, N; Pereira, S;
Publication
JOURNAL OF NEW MUSIC RESEARCH
Abstract
FluidHarmony is an algorithmic method for defining a hierarchical harmonic lexicon in equal temperaments. It utilizes an enharmonic weighted Fourier transform space to represent pitch-class set (pcset) relations. The method ranks pcsets based on user-defined constraints: the importance of interval classes (ICs) and a reference pcset. Evaluation of 5,184 Western musical pieces from the 16th to 20th centuries shows FluidHarmony captures 8% of the corpus's harmony in its top pcsets. This highlights the role of ICs and a reference pcset in regulating harmony in Western tonal music while enabling systematic approaches to define hierarchies and establish metrics beyond 12-TET.
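As a rough illustration of ranking pcsets in a weighted Fourier space, the sketch below computes the discrete Fourier transform of each pcset's indicator vector and sorts candidates by weighted distance to a reference pcset. This is not the FluidHarmony implementation: the function names, the per-coefficient weights, and the Euclidean distance are assumptions made for the example only.

```python
import cmath
import math


def pcset_dft(pcset, n=12):
    """DFT coefficients k = 1..n//2 of a pcset's indicator vector (n-TET)."""
    return [
        sum(cmath.exp(-2j * math.pi * k * pc / n) for pc in pcset)
        for k in range(1, n // 2 + 1)
    ]


def weighted_distance(a, b, weights, n=12):
    """Euclidean distance between two pcsets in the weighted DFT space.

    `weights` holds one hypothetical user-defined weight per coefficient.
    """
    ca, cb = pcset_dft(a, n), pcset_dft(b, n)
    return math.sqrt(sum((w * abs(x - y)) ** 2 for w, x, y in zip(weights, ca, cb)))


def rank_pcsets(candidates, reference, weights, n=12):
    """Sort candidate pcsets from nearest to farthest from the reference."""
    return sorted(candidates, key=lambda s: weighted_distance(s, reference, weights, n))


# Example: rank three trichords against a C major triad with uniform weights.
uniform = [1.0] * 6
ranking = rank_pcsets([(0, 1, 2), (0, 3, 7), (0, 4, 7)], (0, 4, 7), uniform)
```

Because `n` is a parameter, the same sketch applies to other equal temperaments by changing `n` and supplying `n // 2` weights.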
2023
Authors
Lopes, A; Barboza, JR; Bernardes, G;
Publication
2023 Immersive and 3D Audio: from Architecture to Automotive, I3DA 2023
Abstract
Immersive audio technologies have broadened postproduction strategies for spatial audio, gaining popularity among mainstream audiences. However, there is a lack of defined procedures and critical thinking regarding audio mixing guidelines for surround sound in popular music. In this context, we conducted an empirical study to identify trends concerning instrument position, trajectories, and dynamics from surround mixes. Furthermore, we assess the degree to which they differ from their stereo renderings. Seven award-winning songs in the Grammy category for Best Immersive Album were analyzed, including surround 5.1 and stereo versions. The study found consistent instrument positions in the songs, with rhythmic instruments and bass in the center, lead vocals spread across front channels, and harmonic instruments in wider positions. Solo instruments occupied left, right, and center channels, with dynamics emphasizing lead vocals and solos. Trajectories were rarely used, indicating channel-based thinking. Limited adoption of immersive audio dimensions and reliance on stereo techniques were observed, with no notable differences between the surround and stereo versions. Identified song outliers are discussed and offer avenues for exploration, highlighting the importance of diverse musical expressions in informing immersive audio mixing. © 2023 IEEE.
2023
Authors
Cao, Z; Magalhães, E; Bernardes, G;
Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Abstract
We study the impact of sound design – soundscape, sound effects, and auditory notifications, namely earcons – on the player’s experience of serious games. Three sound design versions for the game Venci’s Adventures have been developed: 1) no sound; 2) standard sound design, including soundscapes and sound effects; and 3) standard sound design with auditory notifications (namely, earcons). Perceptual experiments were conducted to evaluate the most suitable attention-retention earcons from a diverse collection of timbres, pitches, and melodic patterns, as well as the user experience of the different sound design versions assessed in pairs (1 vs. 2 and 2 vs. 3). Our results show that participants (n = 23) perceive a better user experience in terms of game-playing competence, immersion, flow, challenge, and affect, and enhanced attention retention when adopting standard sound design with the earcons. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
2023
Authors
Carvalho, N; Diogo, D; Bernardes, G;
Publication
THE 10TH INTERNATIONAL CONFERENCE ON DIGITAL LIBRARIES FOR MUSICOLOGY, DLFM 2023
Abstract
We propose a method for computing the similarity of symbolically-encoded Portuguese folk melodies. The main novelty of our method is the use of a preprocessing melodic reduction at multiple hierarchies to filter the surface of folk melodies according to 1) pitch stability, 2) interval salience, 3) beat strength, 4) durational accents, and 5) the linear combination of all former criteria. Based on the salience of each note event per criterion, we create three melodic reductions with three different levels of note retention. We assess the degree to which six folk music similarity measures at multiple reduction hierarchies comply with collected ground truth from experts in Portuguese folk music. The results show that SIAM combined with 75th quantile reduction using the combined or durational-accent criterion best models the similarity for a corpus of Portuguese folk melodies by capturing approximately 84-90% of the variance observed in ground truth annotations.
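The quantile-based reduction described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that a "75th quantile reduction" keeps the notes whose salience is at or above the 75th quantile of all salience values in the melody, and the function names, the linear-interpolation quantile, and the example salience values are all hypothetical.

```python
def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, with q in [0, 1]."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def combine(criteria, weights):
    """Linear combination of several per-note salience lists."""
    return [sum(w * c[i] for w, c in zip(weights, criteria))
            for i in range(len(criteria[0]))]


def reduce_melody(notes, salience, q=0.75):
    """Keep only the notes whose salience reaches the q-th quantile."""
    threshold = quantile(salience, q)
    return [n for n, s in zip(notes, salience) if s >= threshold]


# Example: five MIDI pitches with made-up salience values.
notes = [60, 62, 64, 65, 67]
salience = [0.9, 0.1, 0.5, 0.2, 0.8]
reduced = reduce_melody(notes, salience)  # keeps the two most salient notes
```

Lower values of `q` retain more of the melodic surface, which is one way to obtain several reduction levels from the same salience profile.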