Publications

Publications by CTM

2007

Semantic Modeling of Digital Multimedia

Authors
Akhgar, B; Siddiqi, JIA; Rahman, F; Shah, N; Korda, N; Attias, R; Benamou, N; Andrade, MT; Dori, YJ; Hashavia, B;

Publication
Proceedings of the 2007 International Conference on Multimedia Systems and Applications, MSA 2007, June 25-28, 2007, Las Vegas Nevada, USA

Abstract

2007

A unified data model and system support for the context-aware access to multimedia content

Authors
Carvalho, P; Andrade, MT; Alberti, C; Castro, H; Calistru, C; Cuetos, Pd;

Publication
Datenbanksysteme in Business, Technologie und Web (BTW 2007), Workshop Proceedings, 5.-6. März 2007, Aachen, Germany

Abstract

2007

Generation of hardware modules for run-time reconfigurable hybrid CPU/FPGA systems

Authors
Silva, ML; Ferreira, JC;

Publication
IET COMPUTERS AND DIGITAL TECHNIQUES

Abstract
A tool called BITLINKER, which creates partially reconfigurable modules from the bit-streams of individual components, is described. It is also capable of performing restricted component placement and interconnect routing between the assembled components. The resulting modules are used in applications that exploit partial dynamic reconfiguration. The tool is integrated in a design flow particularly aimed at dynamically reconfigurable platform field-programmable gate arrays (FPGAs). The associated development design flow and a run-time support system that can be used to manage module activation and data communication are described. Evaluation results obtained with a Virtex-II Pro system are also reported.

2007

New enhancements to Immersive Sound field Rendition (ISR) system

Authors
Dubey, C; Annadana, R; Sinha, D; Ferreira, A;

Publication
Audio Engineering Society - 122nd Audio Engineering Society Convention 2007

Abstract
Consumer audio applications such as satellite radio broadcasts, multi-channel audio streaming and playback systems, coupled with the need to meet stringent bandwidth requirements, are eliciting newer challenges in parametric multichannel audio coding schemes. This paper describes the continuation of our research concerning the Immersive Soundfield Rendition (ISR) system and the different enhancements in various algorithmic components. The need to maintain a constant bit rate for many applications requires a rate control mechanism. The various strategies utilized in the rate control mechanism are presented. In addition, an innovative phase compensated down-mixing scheme has been incorporated in the ISR system so as to generate a high quality carrier signal. Enhancements have been made to the blind up-mixing scheme, and considerable gains have been made in terms of acoustic diversity. The various enhancements of the ISR system and its performance are detailed. Audio demonstrations are available at http://www.atc-labs.com/isr.

2007

Static features in real-time recognition of isolated vowels at high pitch

Authors
Ferreira, AJS;

Publication
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA

Abstract
This paper addresses the problem of automatic identification of vowels uttered in isolation by female and child speakers. In this case, the magnitude spectrum of voiced vowels is sparsely sampled since only frequencies at integer multiples of F0 are significant. This impacts negatively on the performance of vowel identification techniques that either ignore pitch or rely on global shape models. A new pitch-dependent approach to vowel identification is proposed that emerges from the concept of timbre and that defines perceptual spectral clusters (PSC) of harmonic partials. A representative set of static PSC-related features is estimated and their performance is evaluated in automatic classification tests using the Mahalanobis distance. Linear prediction features and Mel-frequency cepstral coefficients (MFCC) are used as a reference and a database of five (Portuguese) natural vowel sounds uttered by 44 speakers (including 27 child speakers) is used for training and testing the Gaussian models. Results indicate that perceptual spectral cluster (PSC) features perform better than plain linear prediction features, but perform slightly worse than MFCC features. However, PSC features have the potential to take full advantage of the pitch structure of voiced vowels, namely in the analysis of concurrent voices, or by using pitch as a normalization parameter. (C) 2007 Acoustical Society of America.

2007

A novel automatic noise removal technique for audio and speech signals

Authors
Harinarayanan, EV; Sinha, D; Saeed, S; Ferreira, A;

Publication
Audio Engineering Society - 123rd Audio Engineering Society Convention 2007

Abstract
This paper introduces new ideas on wideband stationary/non-stationary noise removal for audio signals. Current noise reduction techniques have generally proven to be effective, yet these typically exhibit certain undesirable characteristics. Distortion and/or alteration of the audio characteristics of the primary audio sound is a common problem. Also, user intervention in identifying the noise profile is sometimes necessary. The proposed technique is centered on the classical Kalman filtering technique for noise removal but uses a novel architecture whereby advanced signal processing techniques are used to identify and preserve the richness of the audio spectrum. The paper also includes conceptual and derivative results on parameter estimation, a description of a multi-parameter Signal Activity Detector (SAD) and our newly found improved results.
