Publication Type : Journal Article
Publisher : Springer
Source : Springer-International Journal of Speech Technology, March 2024, (SCI, Impact factor: 2.7)
Url : https://link.springer.com/article/10.1007/s10772-024-10095-8
Campus : Chennai
School : School of Engineering
Department : Electronics and Communication
Year : 2024
Abstract : Speech Emotion Recognition (SER) is the process of recognizing and classifying emotions expressed through speech. SER facilitates personalized and empathetic interactions, enhances user experiences, enables sentiment analysis, and finds applications in psychology, healthcare, entertainment, and gaming. However, accurately detecting and classifying emotions is a highly challenging task for machines due to the complexity and multifaceted nature of emotions. This work presents a comparative analysis of two approaches for emotion recognition based on original and augmented speech signals. The first approach extracts 39 Mel Frequency Cepstral Coefficient (MFCC) features, while the second uses MFCC spectrograms and extracts features with deep learning models such as MobileNet V2, VGG16, Inception V3, VGG19 and ResNet-50. These features are then evaluated with machine learning classifiers such as SVM, Linear SVM, Naive Bayes, k-Nearest Neighbours, Logistic Regression and Random Forest. From the experiments, it is observed that the SVM classifier works best with all the feature extraction techniques. Furthermore, to enhance the results, ensembling techniques involving CatBoost and the Voting classifier along with SVM were utilized, yielding improved test accuracies of 97.04% on the RAVDESS dataset, 93.24% on the SAVEE dataset, and 99.83% on the TESS dataset, respectively. Notably, both approaches are computationally efficient, as the feature extractors required no training time.
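The classification stage described in the abstract can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's actual code: it assumes the 39-dimensional MFCC feature vectors have already been extracted (synthetic values stand in for them here), and it uses a scikit-learn soft-voting ensemble over SVM, Logistic Regression and Random Forest; the paper's CatBoost component is omitted to keep the example to one dependency.

```python
# Hypothetical sketch of classifying pre-extracted 39-dim MFCC feature
# vectors with an SVM-based soft-voting ensemble. The feature matrix X
# below is synthetic; the actual work extracts MFCCs from the
# RAVDESS/SAVEE/TESS audio recordings.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_samples, n_mfcc, n_emotions = 400, 39, 8  # RAVDESS covers 8 emotion classes

# Synthetic class-dependent features standing in for real MFCC vectors
y = rng.integers(0, n_emotions, n_samples)
X = rng.normal(size=(n_samples, n_mfcc)) + y[:, None] * 0.5

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

# Soft voting averages the per-class probabilities of the base models
ensemble = VotingClassifier(
    estimators=[
        ("svm", SVC(kernel="rbf", probability=True, random_state=0)),
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ],
    voting="soft",
)
ensemble.fit(X_train, y_train)
accuracy = ensemble.score(X_test, y_test)
print(f"ensemble test accuracy: {accuracy:.2f}")
```

On real data, the same pipeline would replace the synthetic X with MFCCs computed per utterance (e.g. via a library such as librosa), which is what keeps inference lightweight: the hand-crafted features need no trained feature extractor.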
Cite this Research Publication : Aishwarya N, Kanwaljeet Kaur and Karthik Seemakurthy, “A Computationally Efficient Speech Emotion Recognition System employing Machine learning Classifiers and Ensemble learning”, Springer-International Journal of Speech Technology, March 2024, (SCI, Impact factor: 2.7)