
Vocal Emotion Conversion Using WSOLA and Linear Prediction

Publication Type : Conference Proceedings

Publisher : Lecture Notes in Computer Science

Source : International Conference on Speech and Computer (SPECOM 19), Springer Verlag, Volume 10458 LNAI, pp. 777-787 (2017)

Url : https://www.scopus.com/inward/record.uri?eid=2-s2.0-85029521841&doi=10.1007%2f978-3-319-66429-3_78&partnerID=40&md5=8c4a3584ee3abc6c8e3c320700f3d4ce

ISBN : 9783319664286

Campus : Bengaluru

School : School of Engineering

Department : Electrical and Electronics

Year : 2017

Abstract : The paper addresses speech emotion conversion using Waveform Similarity Overlap-Add (WSOLA) followed by linear prediction (LP) analysis for spectral transformation. Duration is modified according to the ratio between the segment durations of neutral and target speech. After WSOLA modification, the duration-modified source speech is time-aligned with the target and subjected to linear prediction analysis to yield the LP coefficients (LPCs). The target emotion is re-synthesised using the prosody-manipulated residual and the LPCs from the source. The waveform similarity property of WSOLA is exploited to produce output with minimal distortion. The proposed algorithm is evaluated subjectively and objectively alongside the popular TD-PSOLA algorithm. The correlation between synthesised and real target speech shows an average improvement of 60% across all emotions with the proposed technique. © Springer International Publishing AG 2017.
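
The two signal-processing steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the frame length, the search tolerance, the LP order, and the direction of the duration ratio (target over neutral) are all assumptions chosen for illustration. It shows a basic WSOLA time-scale modification and an autocorrelation-method LP analysis with residual extraction and all-pole resynthesis.

```python
import numpy as np

def duration_ratio(neutral_dur, target_dur):
    # Assumed convention: factor > 1 lengthens the neutral segment.
    return target_dur / neutral_dur

def wsola(x, alpha, frame=512, tol=128):
    """Stretch x by factor alpha with a basic WSOLA (illustrative sketch)."""
    hop_s = frame // 2                      # synthesis hop (50% overlap)
    hop_a = hop_s / alpha                   # nominal analysis hop
    win = np.hanning(frame + 1)[:-1]        # periodic Hann window
    xp = np.pad(x, (tol, frame + tol + hop_s))
    n_frames = int(np.ceil(len(x) / hop_a))
    y = np.zeros(n_frames * hop_s + frame)
    norm = np.zeros_like(y)
    start = tol                             # first frame: no search
    for k in range(n_frames):
        if k > 0:
            # Natural continuation of the previously copied segment.
            target = xp[start + hop_s:start + hop_s + frame]
            p = int(k * hop_a) + tol        # nominal analysis position
            region = xp[p - tol:p + tol + frame]
            # Pick the candidate with maximal cross-correlation.
            start = p - tol + int(np.argmax(np.correlate(region, target, "valid")))
        y[k * hop_s:k * hop_s + frame] += win * xp[start:start + frame]
        norm[k * hop_s:k * hop_s + frame] += win
    y /= np.maximum(norm, 1e-8)             # undo window weighting
    return y[:int(round(alpha * len(x)))]

def lpc(s, order):
    """LP coefficients a[k] such that s[n] ~ sum_k a[k] * s[n-1-k]."""
    r = np.correlate(s, s, "full")[len(s) - 1:len(s) + order]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:order + 1])

def lp_residual(s, a):
    # Inverse (FIR) filtering with A(z) = 1 - sum_k a[k] z^-(k+1).
    return np.convolve(s, np.concatenate(([1.0], -a)))[:len(s)]

def lp_synthesize(e, a):
    # All-pole filtering of the residual with 1 / A(z).
    s = np.zeros(len(e))
    for n in range(len(e)):
        past = s[max(0, n - len(a)):n][::-1]   # s[n-1], s[n-2], ...
        s[n] = e[n] + np.dot(a[:len(past)], past)
    return s
```

In a conversion setting along the lines the abstract describes, the residual returned by `lp_residual` would be prosodically modified before being passed to `lp_synthesize`; with an unmodified residual the two functions simply reconstruct the input frame.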

Cite this Research Publication : Vekkot, S. and Tripathi, S., “Vocal Emotion Conversion using WSOLA and Linear Prediction”, International Conference on Speech and Computer (SPECOM 19), vol. 10458 LNAI. Springer Verlag, pp. 777-787, 2017.
