Details
Name
Aníbal Ferreira
Role
Senior Researcher
Since
22nd November 1995
Nationality
Portugal
Centre
Telecommunications and Multimedia
Contacts
+351222094299
anibal.ferreira@inesctec.pt
2024
Authors
Oliveira, M; Santos, V; Saraiva, A; Ferreira, A;
Publication
SIGNALS
Abstract
Many natural signals exhibit quasi-periodic behaviors and are conveniently modeled as combinations of several harmonic sinusoids whose relative frequencies, magnitudes, and phases vary with time. The waveform shapes of those signals reflect important physical phenomena underlying their generation, requiring those parameters to be accurately estimated and modeled. In the literature, accurate phase estimation and modeling have received significantly less attention than frequency or magnitude estimation. This paper first addresses accurate DFT-based phase estimation of individual sinusoids across six scenarios involving two DFT-based filter banks and three different windows. It has been shown that bias in phase estimation is less than 0.001 radians when the SNR is equal to or larger than 2.5 dB. Using the Cramér-Rao lower bound (CRLB) as a reference, it has been demonstrated that one particular window offers performance of practical interest by better approximating the CRLB under favorable signal conditions and minimizing performance deviation under adverse conditions. This paper describes the development of a shift-invariant phase-related feature that characterizes the harmonic phase structure. This feature motivates a new signal processing paradigm that greatly simplifies the parametric modeling, transformation, and synthesis of harmonic signals. It also aids in understanding and reverse engineering the phasegram. The theory and results are discussed from a reproducible perspective, with dedicated experiments supported by code, allowing for the replication of figures and results presented in this paper and facilitating further research.
2023
Authors
Jesus, LMT; Castilho, S; Ferreira, A; Costa, MC;
Publication
JOURNAL OF PHONETICS
Abstract
Purpose: The acoustic signal attributes of whispered speech potentially carry sufficiently distinct information to define vowel spaces and to disambiguate consonant place and voicing, but what these attributes are and the underlying production mechanisms are not fully known. The purpose of this study was to define segmental cues to place and voicing of vowels and sibilant fricatives and to develop an articulatory interpretation of acoustic data.
Method: Seventeen speakers produced sustained sibilants and oral vowels, disyllabic words, and sentences, and read a phonetically balanced text. All the tasks were repeated in voiced and whispered speech, and the sound source and filter were analysed using the following parameters: fundamental frequency, spectral peak frequencies and levels, spectral slopes, sound pressure level and durations. Logistic linear mixed-effects models were developed to understand what acoustic signal attributes carry sufficiently distinct information to disambiguate /i, a/ and /s, ʃ/.
Results: Vowels were produced with significantly different spectral slope, sound pressure level, first and second formant frequencies in voiced and whispered speech. The low frequencies spectral slope of voiced sibilants was significantly different between whispered and voiced speech. The odds of choosing /a/ instead of /i/ were estimated to be lower for whispered speech when compared to voiced speech. Fricatives' broad peak frequency was statistically significant when discriminating between /s/ and /ʃ/.
Conclusions: First formant frequency and relative duration of vowels are consistently used as height cues, and spectral slope and broad peak frequency are attributes associated with consonantal place of articulation. The relative duration of same-place voiceless fricatives was higher than voiced fricatives both in voiced and whispered speech. The evidence presented in this paper can be used to restore voiced speech signals, and to inform rehabilitation strategies that can safely explore the production mechanisms of whispering.
© 2023 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
2023
Authors
Silva, JM; Oliveira, MA; Saraiva, AF; Ferreira, AJS;
Publication
ACOUSTICS
Abstract
The estimation of the frequency of sinusoids has been the object of intense research for more than 40 years. Its importance in classical fields such as telecommunications, instrumentation, and medicine has been extended to numerous specific signal processing applications involving, for example, speech, audio, and music processing. In many cases, these applications run in real-time and, thus, require accurate, fast, and low-complexity algorithms. Taking the normalized Cramér-Rao lower bound as a reference, this paper evaluates the relative performance of nine non-iterative discrete Fourier transform-based individual sinusoid frequency estimators when the target sinusoid is affected by full-bandwidth quasi-harmonic interference, in addition to stationary noise. Three levels of the quasi-harmonic interference severity are considered: no harmonic interference, mild harmonic interference, and strong harmonic interference. Moreover, the harmonic interference is amplitude-modulated and frequency-modulated reflecting real-world conditions, e.g., in singing and musical chords. Results are presented for when the Signal-to-Noise Ratio varies between -10 dB and 70 dB, and they reveal that the relative performance of different frequency estimators depends on the SNR and on the selectivity and leakage of the window that is used, but also changes drastically as a function of the severity of the quasi-harmonic interference. In particular, when this interference is strong, the performance curves of the majority of the tested frequency estimators collapse to a few trends around and above 0.4% of the DFT bin width.
2023
Authors
Jesus, LMT; Ferreira, JFS; Ferreira, AJS;
Publication
JASA EXPRESS LETTERS
Abstract
The temporal distribution of acoustic cues in whispered speech was analyzed using the gating paradigm. Fifteen Portuguese participants listened to real disyllabic words produced by four Portuguese speakers. Lexical choices, confidence scores, isolation points (IPs), and recognition points (RPs) were analyzed. Mixed effects models predicted that the first syllable and 70% of the total duration of the second syllable were needed for lexical choices to be above chance level. Fricatives' place, not voicing, had a significant effect on the percentage of correctly identified words. IP and RP values of words with postalveolar voiced and voiceless fricatives were significantly different.
Supervised Thesis
2023
Author
Gonçalo Duarte Nunes
Institution
UP-FEUP
2023
Author
Nélio David de Freitas Gonçalves
Institution
UP-FEUP