Publication Type : Conference Proceedings
Publisher : International Symposium on Signal Processing and Intelligent Recognition Systems
Campus : Coimbatore
School : School of Engineering
Center : Computational Engineering and Networking
Verified : No
Year : 2018
Abstract : The Lombard effect (LE) is the phenomenon in which a person tends to speak louder in the presence of loud noise, due to the obstruction of self-auditory feedback. The main objective of this work is to develop a dataset for studying the influence of LE on speech parameters. The proposed dataset, comprising 230 utterances from each of 10 speakers, consists of simultaneous recordings of speech and ElectroGlottoGram (EGG) signals for speech under LE, as well as for neutral speech recorded in a noise-free condition. The speech under LE is recorded at 5 different levels (30 dB, 15 dB, 5 dB, 0 dB and −20 dB) of babble noise. The level of LE in the developed dataset is demonstrated by comparing (a) the source parameters, (b) speaker recognition rates and (c) epoch extraction performance. Source parameters such as pitch and Strength of Excitation (SoE) are compared between the neutral speech and the speech under LE; the speech under LE shows higher pitch and lower SoE. In addition, recognition performance drops when a speaker recognition system based on Mel Frequency Cepstral Coefficients (MFCC) and a Gaussian Mixture Model (GMM), built using the neutral speech, is tested with speech under LE from the same set of speakers. Finally, a comparison of epoch extraction from neutral speech and from speech under LE shows that the utterances with LE have higher epoch deviation than the neutral utterances. Together, these experiments confirm the level of LE in the prepared database and reinforce the issues that arise in processing speech under LE for different speech processing tasks.