
Age-Based Automatic Voice Conversion Using Blood Relation for Voice Impaired

Publication Type : Journal Article

Publisher : Computers, Materials & Continua, Vol.70, No.2, 2022, pp.4027-4051 (SCIE Journal, IF: 3.772 Citescore: 4.6 Q1: 80 percentile)

Source : CMC-COMPUTERS MATERIALS & CONTINUA

Url : https://www.researchgate.net/publication/355145538_Age-Based_Automatic_Voice_Conversion_Using_Blood_Relation_for_Voice_Impaired

Keywords : Blood relations; KFCG; LBG; MFCC; vector quantization; correlation; speech samples; same-gender; dissimilar gender; voice conversion; PSOLA; SVM

Campus : Bengaluru, Coimbatore

School : School of Engineering

Center : Center for Computational Engineering and Networking, Computational Engineering and Networking

Department : Electronics and Communication

Year : 2022

Abstract : This work presents a statistical method for translating human voices across age groups, based on commonalities in the voices of blood relations. The age-translated voices are naturalized by extracting blood-relation features (e.g., pitch, duration, and energy) using Mel Frequency Cepstrum Coefficients (MFCC), for the social compatibility of the voice-impaired. The system is demonstrated using standard English and an Indian language. The voice samples for resynthesis were collected from 12 families, with members aged 8 to 80 years. The voice-age translation was performed with the Pitch Synchronous Overlap and Add (PSOLA) approach, by modulating the extracted voice features, and was validated by a perception test. The translated and resynthesized voices were correlated using the Linde, Buzo, Gray (LBG) and Kekre's Fast Codebook Generation (KFCG) algorithms. For translated voice targets, a strong correlation (θ >∼93% and θ >∼96%) was found with blood relatives, whereas a weak correlation (θ <∼78% and θ <∼80%) was found between different families and between different genders within the same family. The study further subcategorized the sampling and synthesis of the voices into similar-gender and dissimilar-gender groups, using a support vector machine (SVM) to choose between the available voice samples. Finally, accuracies of ∼96%, ∼93%, and ∼94% were obtained in identifying the gender of the voice sample, identifying the age group of the samples, and correlating the original and converted voice samples, respectively. The results were close to the features of natural voice samples and are envisaged to facilitate a near-natural voice for the speech-impaired.
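
For readers unfamiliar with the LBG algorithm named in the abstract, the following is a minimal, generic sketch of LBG codebook training for vector quantization. It is not the authors' implementation: the function names (`lbg`, `nearest`, `centroid`, `distortion`), the additive split offset `eps`, and the toy two-dimensional vectors standing in for MFCC frames are all assumptions made for illustration.

```python
def nearest(vec, codebook):
    """Index of the codeword closest to vec (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(vec, codebook[i])),
    )


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def lbg(vectors, size, eps=0.01, iters=20):
    """Grow a codebook by binary splitting, refining after each split.

    The codebook doubles at every split, so the result has the smallest
    power-of-two length that is >= size. eps is an assumed additive offset.
    """
    codebook = [centroid(vectors)]
    while len(codebook) < size:
        # Split every codeword into two slightly shifted copies.
        codebook = [[c + s for c in cw] for cw in codebook for s in (eps, -eps)]
        for _ in range(iters):
            # Assign each vector to its nearest codeword.
            clusters = [[] for _ in codebook]
            for v in vectors:
                clusters[nearest(v, codebook)].append(v)
            # Move each codeword to its cluster mean; keep it if the cluster is empty.
            codebook = [
                centroid(cl) if cl else cw for cl, cw in zip(clusters, codebook)
            ]
    return codebook


def distortion(vectors, codebook):
    """Mean squared distance from each vector to its nearest codeword."""
    total = 0.0
    for v in vectors:
        cw = codebook[nearest(v, codebook)]
        total += sum((a - b) ** 2 for a, b in zip(v, cw))
    return total / len(vectors)


# Toy stand-in for per-frame feature vectors from two distinct voices.
frames = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.2],
          [10.0, 10.0], [10.2, 10.1], [10.1, 10.2]]
codebook = lbg(frames, 2)
```

In a speaker-comparison setting, one codebook would typically be trained per voice, and the distortion of another voice's frames against that codebook would serve as a similarity measure. How the paper turns this into its reported correlation values is not stated in the abstract.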

Cite this Research Publication : Palli Padmini, C. Paramasivam, G. Jyothish Lal, Sadeen Alharbi, and Kaustav Bhowmick, Age-Based Automatic Voice Conversion Using Blood Relation for Voice Impaired, Computers, Materials & Continua, Vol.70, No.2, 2022, pp.4027-4051 (SCIE Journal, IF: 3.772 Citescore: 4.6 Q1: 80 percentile)
