Publication Type : Conference Paper
Publisher : IEEE
Source : 2023 International Conference on Intelligent Systems for Communication, IoT and Security (ICISCoIS), Coimbatore, India, 2023, pp. 253-256, doi: 10.1109/ICISCoIS56541.2023.10100390 (IEEE Xplore)
Url : https://ieeexplore.ieee.org/document/10100390
Campus : Coimbatore
School : School of Computing
Year : 2023
Abstract : Automatic speech recognition (ASR) has expanded into more contexts recently due to the prevalence of smart devices. In noisy settings, visual speech recognition (VSR), often known as lip reading, can be an important component of ASR. Because it operates without audio, VSR is an integral part of audio-visual speech recognition (AVSR) systems and is useful where background noise is high, such as while driving a car or using a cell phone. VSR systems draw on established statistical and machine-learning methods as well as more recent deep-learning techniques. By employing an encoder-decoder attention-based method, the proposed visual speech recognition system reduces the word error rate (WER) to around 2.8% on the benchmark GRID corpus and 40.1% on the LRS2 corpus. The results are evaluated against two state-of-the-art methods, LipType and LipNet.
Cite this Research Publication : A. Kumar, D. K. Renuka and M. C. S. Priya, "Towards Robust Speech Recognition Model Using Deep Learning," 2023 International Conference on Intelligent Systems for Communication, IoT and Security (ICISCoIS), Coimbatore, India, 2023, pp. 253-256, doi: 10.1109/ICISCoIS56541.2023.10100390.