Details

  • Name

    Aníbal Ferreira
  • Role

    Senior Researcher
  • Since

    22nd November 1995
Publications

2023

Discriminative segmental cues to vowel height and consonantal place and voicing in whispered speech

Authors
Jesus, LMT; Castilho, S; Ferreira, A; Costa, MC;

Publication
JOURNAL OF PHONETICS

Abstract
Purpose: The acoustic signal attributes of whispered speech potentially carry sufficiently distinct information to define vowel spaces and to disambiguate consonant place and voicing, but what these attributes are and the underlying production mechanisms are not fully known. The purpose of this study was to define segmental cues to place and voicing of vowels and sibilant fricatives and to develop an articulatory interpretation of acoustic data. Method: Seventeen speakers produced sustained sibilants and oral vowels, disyllabic words, sentences and read a phonetically balanced text. All the tasks were repeated in voiced and whispered speech, and the sound source and filter analysed using the following parameters: Fundamental frequency, spectral peak frequencies and levels, spectral slopes, sound pressure level and durations. Logistic linear mixed-effects models were developed to understand what acoustic signal attributes carry sufficiently distinct information to disambiguate /i, a/ and /s, ʃ/. Results: Vowels were produced with significantly different spectral slope, sound pressure level, first and second formant frequencies in voiced and whispered speech. The low frequencies spectral slope of voiced sibilants was significantly different between whispered and voiced speech. The odds of choosing /a/ instead of /i/ were estimated to be lower for whispered speech when compared to voiced speech. Fricatives' broad peak frequency was statistically significant when discriminating between /s/ and /ʃ/. Conclusions: First formant frequency and relative duration of vowels are consistently used as height cues, and spectral slope and broad peak frequency are attributes associated with consonantal place of articulation. The relative duration of same-place voiceless fricatives was higher than voiced fricatives both in voiced and whispered speech. The evidence presented in this paper can be used to restore voiced speech signals, and to inform rehabilitation strategies that can safely explore the production mechanisms of whispering. © 2023 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

2023

One-Step Discrete Fourier Transform-Based Sinusoid Frequency Estimation under Full-Bandwidth Quasi-Harmonic Interference

Authors
Silva, JM; Oliveira, MA; Saraiva, AF; Ferreira, AJS;

Publication
ACOUSTICS

Abstract
The estimation of the frequency of sinusoids has been the object of intense research for more than 40 years. Its importance in classical fields such as telecommunications, instrumentation, and medicine has been extended to numerous specific signal processing applications involving, for example, speech, audio, and music processing. In many cases, these applications run in real-time and, thus, require accurate, fast, and low-complexity algorithms. Taking the normalized Cramer-Rao lower bound as a reference, this paper evaluates the relative performance of nine non-iterative discrete Fourier transform-based individual sinusoid frequency estimators when the target sinusoid is affected by full-bandwidth quasi-harmonic interference, in addition to stationary noise. Three levels of the quasi-harmonic interference severity are considered: no harmonic interference, mild harmonic interference, and strong harmonic interference. Moreover, the harmonic interference is amplitude-modulated and frequency-modulated reflecting real-world conditions, e.g., in singing and musical chords. Results are presented for when the Signal-to-Noise Ratio varies between -10 dB and 70 dB, and they reveal that the relative performance of different frequency estimators depends on the SNR and on the selectivity and leakage of the window that is used, but also changes drastically as a function of the severity of the quasi-harmonic interference. In particular, when this interference is strong, the performance curves of the majority of the tested frequency estimators collapse to a few trends around and above 0.4% of the DFT bin width.

2023

Identification of words in whispered speech: The role of cues to fricatives' place and voicing

Authors
Jesus, LMT; Ferreira, JFS; Ferreira, AJS;

Publication
JASA EXPRESS LETTERS

Abstract
The temporal distribution of acoustic cues in whispered speech was analyzed using the gating paradigm. Fifteen Portuguese participants listened to real disyllabic words produced by four Portuguese speakers. Lexical choices, confidence scores, isolation points (IPs), and recognition points (RPs) were analyzed. Mixed effects models predicted that the first syllable and 70% of the total duration of the second syllable were needed for lexical choices to be above chance level. Fricatives' place, not voicing, had a significant effect on the percentage of correctly identified words. IP and RP values of words with postalveolar voiced and voiceless fricatives were significantly different.

2023

Analysis and Re-Synthesis of Natural Cricket Sounds Assessing the Perceptual Relevance of Idiosyncratic Parameters

Authors
Oliveira, M; Almeida, V; Silva, J; Ferreira, A;

Publication
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Abstract
Cricket sounds are usually regarded as pleasant and, thus, can be used as suitable test signals in psychoacoustic experiments assessing the human listening acuity to specific temporal and spectral features. In addition, the simple structure of cricket sounds makes them prone to reverse engineering such that they can be analyzed and re-synthesized with desired alterations in their defining parameters. This paper describes cricket sounds from a parametric point of view, characterizes their main temporal and spectral features, namely jitter, shimmer and frequency sweeps, and explains a re-synthesis process generating modified natural cricket sounds. These are subsequently used in listening tests helping to shed light on the sound identification and discrimination capabilities of humans that are important, for example, in voice recognition. © 2023 IEEE.

2022

Simple and effective signal processing pinpointing subtle premature ventricular contractions inferred from increasing physical effort

Authors
Ferreira, AJS;

Publication
2022 13th International Symposium on Communication Systems, Networks and Digital Signal Processing, CSNDSP 2022

Abstract
Premature ventricular contractions (PVC), or extrasystoles, represent a type of cardiac arrhythmia that is common among the general population and, notably, among athletes or individuals who exercise frequently. PVC may be asymptomatic and not clinically relevant when their rate is low, up to around 0.5%, or may be symptomatic and clinically relevant when it is high, in the order of or above 10%. ECG analysis in association with a cardiac stress test is important to detect and characterize PVC and to diagnose the heart condition and operation. In this paper, we describe and test a simple signal processing approach that can be used to effectively pinpoint subtle PVC occurrences in various physical effort conditions. In this regard, we discuss i) three important conditions to be met such that PVC are categorized as benign, ii) the design and implementation of a cardiac stress test and ECG data collection, iii) the algorithm analyzing and extracting information from the detected PVC occurrences, and iv) we present and discuss the obtained results, and conclude on their significance. © 2022 IEEE.

Supervised theses

2023

Whispered speech segmentation based on Deep Learning

Author
Gonçalo Duarte Nunes

Institution
UP-FEUP

2023

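Synthetic voicing of dysphonic voice through real-time digital synthesis of harmonic structures (original title: Vozeamento sintético de voz disfónica através da síntese digital de estruturas harmónicas em tempo real)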

Author
Nélio David de Freitas Gonçalves

Institution
UP-FEUP

2023

Dysphonic to natural voice reconstruction based on adaptive phonetic segmentation and synthetic implantation

Author
João Miguel Pinto Pereira da Silva

Institution
UP-FEUP