Publication Type : Conference Paper
Publisher : Springer
Source : Intelligent Systems, Technologies and Applications: Proceedings of Sixth ISTA 2021, India, Springer, DOI: https://doi.org/10.1007/978-981-16-0730-1_21, 2021.
Url : https://link.springer.com/chapter/10.1007/978-981-16-0730-1_21
Campus : Amritapuri
School : School of Computing
Center : Computer Vision and Robotics
Year : 2021
Abstract : Generating audio from a visual scene is an extremely challenging yet useful task, with applications in remote surveillance, speech comprehension for hearing-impaired people, and silent speech interfaces (SSI). Owing to recent advances in deep neural network techniques, there has been considerable research effort toward speech reconstruction from silent videos or visual speech. In this survey paper, we review several recent papers in this area and make a comparative study in terms of their architectural models and the accuracy they achieve.
Cite this Research Publication : Krishna Suresh, G Gopakumar, Subhasri Duttagupta, "Generating Audio from Lip Movements Visual Input: A Survey," Intelligent Systems, Technologies and Applications: Proceedings of Sixth ISTA 2021, India, Springer, DOI: https://doi.org/10.1007/978-981-16-0730-1_21, 2021.