Publication Type : Conference Proceedings
Publisher : International Conference on Speech and Signal Processing (ICSSP 2014)
Campus : Coimbatore
School : School of Engineering
Center : Computational Engineering and Networking
Year : 2014
Abstract : The objective of the present work is to demonstrate the need for dynamically incorporating segmental durations in emotion conversion. Emotion conversion is the task of converting speech in one emotion to another. Most existing techniques achieve emotion conversion by applying static variations to the prosodic parameters according to the target emotion. The present work analyzes the segmental durations of various phonemes in a large emotional speech corpus and demonstrates the dynamic variations in the duration of various phonetic segments across emotions. The CSTR emotional speech corpus, which contains two emotions (Angry and Happy) in addition to neutral, with 400 utterances per emotion from one speaker, is used as the database for the experimental studies. The segmental durations of the phonemes are obtained statistically by classification and regression tree (CART) modeling of each emotion in the database.