Publication Type : Conference Paper
Publisher : IEEE
Source : 2023 International Conference on Intelligent Systems for Communication, IoT and Security (ICISCoIS), Coimbatore, India, 2023, pp. 253-256, doi: 10.1109/ICISCoIS56541.2023.10100390 (IEEE Xplore)
Url : https://ieeexplore.ieee.org/document/10100390
Campus : Coimbatore
School : School of Computing
Year : 2023
Abstract : Automatic speech recognition (ASR) has expanded into more contexts recently due to the prevalence of smart devices. In noisy settings, visual speech recognition (VSR), often known as lip reading, can be an important component of ASR. Because it operates without audio, VSR is an integral part of audio-visual speech recognition (AVSR) systems and is useful where background noise is high, such as while driving a car or using a cell phone. VSR systems draw on established statistical and machine-learning methods as well as more recent deep-learning techniques. By employing an encoder-decoder attention-based method, the proposed visual speech recognition system reduces the word error rate (WER) to around 2.8% on the benchmark GRID corpus and 40.1% on the LRS2 corpus. The results are evaluated against two state-of-the-art methods, LipType and LipNet.
Cite this Research Publication : A. Kumar, D. K. Renuka and M. C. S. Priya, "Towards Robust Speech Recognition Model Using Deep Learning," 2023 International Conference on Intelligent Systems for Communication, IoT and Security (ICISCoIS), Coimbatore, India, 2023, pp. 253-256, doi: 10.1109/ICISCoIS56541.2023.10100390.