Publication Type : Conference Proceedings
Publisher : TENCON
Source : TENCON 2019 - 2019 IEEE Region 10 Conference (TENCON) (2019)
Url : https://ieeexplore.ieee.org/document/8929432
Campus : Bengaluru
School : School of Engineering
Department : Electronics and Communication
Year : 2019
Abstract : Vocal emotion conversion is a significant post-processing step in text-to-speech synthesis. Existing emotion conversion models are data-intensive and computationally expensive. This paper puts forth a hybrid emotion conversion framework for Telugu that uses an enhanced global variance GMM for spectral conversion. Fundamental frequency (F0) conversion is achieved with a multi-layer feed-forward ANN applied to higher-dimensional CWT-decomposed F0. The method outperforms the baseline least-squares GMM in perceptual quality for all three emotions considered, viz. anger, fear and happiness. The strength of the constrained variance (CV)-GMM is exploited to improve the quality of the synthesized emotional speech. Listening tests indicated that quality is considerably improved, with a maximum Comparative Mean Opinion Score (CMOS) of 4.42 (anger) on the Telugu dataset.
Cite this Research Publication : S. Vekkot and D. Gupta, “Emotion Conversion in Telugu using Constrained Variance GMM and Continuous Wavelet Transform-F0”, TENCON 2019 - 2019 IEEE Region 10 Conference (TENCON). 2019.