About

Gilberto Bernardes holds a Ph.D. in Digital Media (2014) from the Universidade do Porto, under the auspices of the University of Texas at Austin, and a Master of Music cum laude (2008) from the Amsterdamse Hogeschool voor de Kunsten. He is currently an Assistant Professor at the Universidade do Porto and a Senior Researcher at INESC TEC, where he leads the Sound and Music Computing Lab. He has more than 90 publications, including 14 articles in peer-reviewed journals with a high impact factor (mostly Q1 and Q2 in Scimago) and 14 book chapters. He has co-authored scientific papers with 152 international collaborators. Bernardes contributes continuously to the training of junior scientists: he is currently supervising six Ph.D. theses and has supervised more than 40 completed Master's dissertations.


He has received nine awards, including the Fraunhofer Portugal Prize for the best Ph.D. thesis and several best paper awards at conferences (e.g., DCE and CMMR). He has participated in 12 R&D projects as a senior or junior researcher. In the eight years since his Ph.D. defense, Bernardes has attracted competitive funding for a post-doctoral project funded by FCT and an exploratory grant for a market-based R&D prototype. He currently leads the Portuguese team (as Work Package leader) at INESC TEC in the Horizon Europe project EU-DIGIFOLK and the Erasmus+ project Open Minds. His latest work focuses on cognitive-inspired tonal music representations and sound synthesis. In his artistic activities, Bernardes has performed at distinguished music venues such as Bimhuis, Concertgebouw, Casa da Música, Berklee College of Music, New York University, and the Seoul Computer Music Festival.

Details

  • Name: Gilberto Bernardes Almeida
  • Role: Senior Researcher
  • Since: 14th July 2014
Publications

2025

Evaluation of Lyrics Extraction from Folk Music Sheets Using Vision Language Models (VLMs)

Authors
Sales Mendes, A; Lozano Murciego, Á; Silva, LA; Jiménez Bravo, M; Navarro Cáceres, M; Bernardes, G;

Publication
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Abstract
Monodic folk music has traditionally been preserved in physical documents. It constitutes a vast archive that needs to be digitized to facilitate comprehensive analysis using AI techniques. A critical component of music score digitization is the transcription of lyrics, an extensively researched process in Optical Character Recognition (OCR) and document layout analysis. These fields typically require the development of specific models that operate in several stages: first, to detect the bounding boxes of specific texts, then to identify the language, and finally, to recognize the characters. Recent advances in vision language models (VLMs) have introduced multimodal capabilities, such as processing images and text, which are competitive with traditional OCR methods. This paper proposes an end-to-end system for extracting lyrics from images of handwritten musical scores. We aim to evaluate the performance of two state-of-the-art VLMs to determine whether they can eliminate the need to develop specialized text recognition and OCR models for this task. The results of the study, obtained from a dataset in a real-world application environment, are presented along with promising new research directions in the field. This progress contributes to preserving cultural heritage and opens up new possibilities for global analysis and research in folk music. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

2024

Acting Emotions: a comprehensive dataset of elicited emotions

Authors
Aly, L; Godinho, L; Bota, P; Bernardes, G; da Silva, HP;

Publication
SCIENTIFIC DATA

Abstract
Emotions encompass physiological systems that can be assessed through biosignals like electromyography and electrocardiography. Prior investigations in emotion recognition have primarily focused on general population samples, overlooking the specific context of theatre actors who possess exceptional abilities in conveying emotions to an audience, namely acting emotions. We conducted a study involving 11 professional actors to collect physiological data for acting emotions to investigate the correlation between biosignals and emotion expression. Our contribution is the DECEiVeR (DatasEt aCting Emotions Valence aRousal) dataset, a comprehensive collection of various physiological recordings meticulously curated to facilitate the recognition of a set of five emotions. Moreover, we conduct a preliminary analysis on modeling the recognition of acting emotions from raw, low- and mid-level temporal and spectral data and the reliability of physiological data across time. Our dataset aims to leverage a deeper understanding of the intricate interplay between biosignals and emotional expression. It provides valuable insights into acting emotion recognition and affective computing by exposing the degree to which biosignals capture emotions elicited from inner stimuli.

2024

Exploring Mode Identification in Irish Folk Music with Unsupervised Machine Learning and Template-Based Techniques

Authors
Navarro-Cáceres, JJ; Carvalho, N; Bernardes, G; Jiménez-Bravo, DM; Navarro-Cáceres, M;

Publication
MATHEMATICS AND COMPUTATION IN MUSIC, MCM 2024

Abstract
Extensive computational research has been dedicated to detecting keys and modes in tonal Western music within the major and minor modes. Little research has been dedicated to other modes and musical expressions, such as folk or non-Western music. This paper tackles this limitation by comparing traditional template-based methods with unsupervised machine-learning methods for diatonic mode detection within folk music. Template-based methods are grounded in music theory and cognition and use predefined profiles against which a musical piece is compared. Unsupervised machine learning autonomously discovers patterns embedded in the data. As a case study, the authors apply the methods to a dataset of Irish folk music called The Session on four diatonic modes: Ionian, Dorian, Mixolydian, and Aeolian. Our evaluation assesses the performance of template-based and unsupervised methods, reaching an average accuracy of about 80%. We discuss the applicability of the methods, namely the potential of unsupervised learning to process unknown musical sources beyond modes with predefined templates.

2024

Fourier Qualia Wavescapes: Hierarchical Analyses of Set Class Quality and Ambiguity

Authors
Pereira, S; Affatato, G; Bernardes, G; Moss, FC;

Publication
MATHEMATICS AND COMPUTATION IN MUSIC, MCM 2024

Abstract
We introduce a novel perspective on set-class analysis combining the DFT magnitudes with the music visualisation technique of wavescapes. With such a combination, we create a visual representation of a piece's multidimensional qualia, where different colours indicate saliency in chromaticity, diadicity, triadicity, octatonicity, diatonicity, and whole-tone quality. At the centre of our methods are: 1) the formal definition of the Fourier Qualia Space (FQS), 2) its particular ordering of DFT coefficients that delineate regions linked to different musical aesthetics, and 3) the mapping of such regions into a coloured wavescape. Furthermore, we demonstrate the intrinsic capability of the FQS to express qualia ambiguity and map it into a synopsis wavescape. Finally, we showcase the application of our methods by presenting a few analytical remarks on Bach's Three-part Invention BWV 795, Debussy's Reflets dans l'eau, and Webern's Four Pieces for Violin and Piano, Op. 7, No. 1, unveiling increasingly ambiguous wavescapes.

2024

Fourier (Common-Tone) Phase Spaces are in Tune with Variational Autoencoders' Latent Space

Authors
Carvalho, N; Bernardes, G;

Publication
MATHEMATICS AND COMPUTATION IN MUSIC, MCM 2024

Abstract
Expanding upon the potential of generative machine learning to create atemporal latent space representations of musical-theoretical and cognitive interest, we delve into their explainability by formulating and testing hypotheses on their alignment with DFT phase spaces from {0, 1}^12 pitch classes and {0, 1}^128 pitch distributions, capturing common-tone tonal functional harmony and parsimonious voice-leading principles, respectively. We use 371 J.S. Bach chorales as a benchmark to train a Variational Autoencoder on a representative piano roll encoding. The Spearman rank correlation between the latent space and the two aforementioned DFT phase spaces exhibits a robust rank association of approximately 0.65 ± 0.05 for pitch classes and 0.61 ± 0.05 for pitch distributions, denoting an effective preservation of harmonic functional clusters per region and parsimonious voice-leading. Furthermore, our analysis prompts essential inquiries about the stylistic characteristics inferred from the rank deviations to the DFT phase space and the balance between the two DFT phase spaces.

Supervised theses

2023

Sound Designing Brands and Establishing Sonic Identities: the Sons Em Trânsito Music Agency

Author
João Pedro Melo Albino de Sá Cardielos

Institution
UP-FEUP

2023

An interactive and digital puppeteering interface for new musical expression (IDPI)

Author
Hibiki Mukai

Institution
UP-FEUP

2023

Synthesizing Soundscapes from Textual Input: Development and Comparison of Generative AI Models

Author
Márcio Cláudio Silva Duarte

Institution
UP-FEUP

2023

AVE - Assessing Ambiguity in Speech-based Affective Virtual Environments

Author
Jorge Federico Forero Rodríguez

Institution
UP-FEUP

2023

Avaliando Preferências Musicais de Crianças no Espectro Autista: Implicações para a Terapia [Assessing Musical Preferences of Children on the Autism Spectrum: Implications for Therapy]

Author
Natália Isabel dos Santos

Institution
UP-FEUP