Details

  • Name

    Nádia Sousa Carvalho
  • Position

    Research Assistant
  • Since

    01 October 2021
Publications

2025

Exploring timbre latent spaces: motion-enhanced sampling for musical co-improvisation

Authors
Carvalho, N; Sousa, J; Portovedo, H; Bernardes, G;

Publication
INTERNATIONAL JOURNAL OF PERFORMANCE ARTS AND DIGITAL MEDIA

Abstract
This article investigates sampling strategies in latent space navigation to enhance co-creative music systems, focusing on timbre latent spaces. Adopting Villa-Rojo's 'Lamento' for tenor saxophone and tape as a case study, we conducted two experiments. The first assessed traditional corpus-based concatenative synthesis (CBCS) sampling within the RAVE model's latent space, finding that sampling strategies gradually deviate from a given target sonority while still relating to the original morphology. The second experiment aimed to define sampling strategies for creating variations of an input signal, namely parallel, contrary, and oblique motions. The findings expose the need to explore individual model layers and the geometric transformation nature of the contrary and oblique motions, which tend to dilate the original shape. The findings also highlight the potential of motion-aware sampling for more contextually aware and expressive control of music structures via CBCS.

2025

Motiv: A Dataset of Latent Space Representations of Musical Phrase Motions

Authors
Carvalho, N; Sousa, J; Bernardes, G; Portovedo, H;

Publication
Proceedings of the 20th International Audio Mostly Conference

Abstract
This paper introduces Motiv, a dataset of expert saxophonist recordings illustrating parallel, similar, oblique, and contrary motions. These motions are variations of three phrases from Jesús Villa-Rojo's "Lamento," with controlled similarities. The dataset includes 116 audio samples recorded by four tenor saxophonists, each annotated with descriptions of motions, musical scores, and latent space vectors generated using the VocalSet RAVE model. Motiv enables the analysis of motion types and their geometric relationships in latent spaces. Our preliminary dataset analysis shows that parallel motions align closely with original phrases, while contrary motions exhibit the largest deviations, and oblique motions show mixed patterns. The dataset also highlights the impact of individual performer nuances. Motiv supports a variety of music information retrieval (MIR) tasks, including gesture-based recognition, performance analysis, and motion-driven retrieval. It also provides insights into the relationship between human motion and music, contributing to real-time music interaction and automated performance systems. © 2025 Copyright held by the owner/author(s).

2024

Exploring Mode Identification in Irish Folk Music with Unsupervised Machine Learning and Template-Based Techniques

Authors
Navarro-Cáceres, JJ; Carvalho, N; Bernardes, G; Jiménez-Bravo, DM; Navarro-Cáceres, M;

Publication
MATHEMATICS AND COMPUTATION IN MUSIC, MCM 2024

Abstract
Extensive computational research has been dedicated to detecting keys and modes in tonal Western music within the major and minor modes. Little research has been dedicated to other modes and musical expressions, such as folk or non-Western music. This paper tackles this limitation by comparing traditional template-based methods with unsupervised machine-learning methods for diatonic mode detection within folk music. Template-based methods are grounded in music theory and cognition and use predefined profiles against which a musical piece is compared. Unsupervised machine learning autonomously discovers patterns embedded in the data. As a case study, the authors apply the methods to a dataset of Irish folk music called The Session on four diatonic modes: Ionian, Dorian, Mixolydian, and Aeolian. Our evaluation assesses the performance of template-based and unsupervised methods, reaching an average accuracy of about 80%. We discuss the applicability of the methods, namely the potential of unsupervised learning to process unknown musical sources beyond modes with predefined templates.
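
To illustrate what a template-based comparison of this kind can look like, the following is a minimal sketch in Python, not the paper's method. It assumes the tonic is already known, uses plain binary scale templates (the actual profiles used by the authors are not given in this abstract), and scores each mode by Pearson correlation with the tune's pitch-class histogram. The names `MODE_TEMPLATES`, `pitch_class_histogram`, and `identify_mode` are hypothetical.

```python
from collections import Counter

# Scale degrees (semitones above the tonic) of the four modes named in the abstract.
# Binary templates are an assumption made for this sketch.
MODE_TEMPLATES = {
    "Ionian":     (0, 2, 4, 5, 7, 9, 11),
    "Dorian":     (0, 2, 3, 5, 7, 9, 10),
    "Mixolydian": (0, 2, 4, 5, 7, 9, 10),
    "Aeolian":    (0, 2, 3, 5, 7, 8, 10),
}


def pitch_class_histogram(midi_pitches, tonic):
    """Count pitch classes relative to the tonic (index 0 is the tonic)."""
    counts = Counter((p - tonic) % 12 for p in midi_pitches)
    return [counts.get(pc, 0) for pc in range(12)]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def identify_mode(midi_pitches, tonic):
    """Return the best-matching mode name and the score of every template."""
    hist = pitch_class_histogram(midi_pitches, tonic)
    scores = {}
    for name, degrees in MODE_TEMPLATES.items():
        template = [1 if pc in degrees else 0 for pc in range(12)]
        scores[name] = pearson(hist, template)
    return max(scores, key=scores.get), scores


# A D Dorian scale fragment as MIDI note numbers: D E F G A B C D.
melody = [62, 64, 65, 67, 69, 71, 72, 74]
best_mode, mode_scores = identify_mode(melody, tonic=62)
```

Because the four modes are rotations of the same diatonic collection, binary templates only separate them once the tonic is fixed; weighted profiles or a joint search over tonic and mode would be needed otherwise.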

2024

Fourier (Common-Tone) Phase Spaces are in Tune with Variational Autoencoders' Latent Space

Authors
Carvalho, N; Bernardes, G;

Publication
MATHEMATICS AND COMPUTATION IN MUSIC, MCM 2024

Abstract
Expanding upon the potential of generative machine learning to create atemporal latent space representations of musical-theoretical and cognitive interest, we delve into their explainability by formulating and testing hypotheses on their alignment with DFT phase spaces from $\{0, 1\}^{12}$ pitch classes and $\{0, 1\}^{128}$ pitch distributions - capturing common-tone tonal functional harmony and parsimonious voice-leading principles, respectively. We use 371 J.S. Bach chorales as a benchmark to train a Variational Autoencoder on a representative piano roll encoding. The Spearman rank correlation between the latent space and the two aforementioned DFT phase spaces exhibits a robust rank association of approximately 0.65 ± 0.05 for pitch classes and 0.61 ± 0.05 for pitch distributions, denoting an effective preservation of harmonic functional clusters per region and parsimonious voice-leading. Furthermore, our analysis prompts essential inquiries about the stylistic characteristics inferred from the rank deviations to the DFT phase space and the balance between the two DFT phase spaces.
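
The two building blocks named in this abstract, the DFT phase of a binary pitch-class vector and the Spearman rank correlation, can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the authors' pipeline: the abstract does not say which DFT coefficients are used or exactly which quantities are rank-correlated, so the coefficient index `k` is left as a parameter, and all function names (`dft_coefficient`, `phase`, `spearman`, `pc_vector`) are hypothetical.

```python
import cmath
import math


def pc_vector(pitch_classes):
    """Binary 12-element vector with 1 at each pitch class present."""
    return [1 if pc in pitch_classes else 0 for pc in range(12)]


def dft_coefficient(vector, k):
    """k-th discrete Fourier transform coefficient of the vector."""
    n = len(vector)
    return sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(vector))


def phase(vector, k):
    """Phase in radians, in (-pi, pi], of the k-th DFT coefficient."""
    return cmath.phase(dft_coefficient(vector, k))


def average_ranks(values):
    """Ranks starting at 1; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for m in range(i, j + 1):
            ranks[order[m]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / (sx * sy)


# Transposing a chord by t semitones rotates the phase of coefficient k by -2*pi*k*t/12.
c_major = pc_vector({0, 4, 7})
g_major = pc_vector({7, 11, 2})
shift = (phase(g_major, 5) - phase(c_major, 5)) % (2 * math.pi)
```

In a study of this kind, `spearman` would be applied to paired measurements taken in the latent space and in the phase space, for example pairwise distances between the same chords; that pairing is an assumption of this sketch.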

2024

Modal Pitch Space: A Computational Model of Melodic Pitch Attraction in Folk Music

Authors
Bernardes, G; Carvalho, N;

Publication
MATHEMATICS AND COMPUTATION IN MUSIC, MCM 2024

Abstract
We introduce a computational model that quantifies melodic pitch attraction in diatonic modal folk music, extending Lerdahl's Tonal Pitch Space. The model incorporates four melodic pitch indicators: vertical embedding distance, horizontal step distance, semitone interval distance, and relative stability. Its scalability is exclusively achieved through prior mode and tonic information, eliminating the need in existing models for additional chordal context. Noteworthy contributions encompass the incorporation of empirically-driven folk music knowledge and the calculation of indicator weights. Empirical evaluation, spanning Dutch, Irish, and Spanish folk traditions across Ionian, Dorian, Mixolydian, and Aeolian modes, uncovers a robust linear relationship between melodic pitch transitions and the pitch attraction model infused with empirically-derived knowledge. Indicator weights demonstrate cross-tradition generalizability, highlighting the significance of vertical embedding distance and relative stability. In contrast, semitone and horizontal step distances assume residual and null functions, respectively.