Generating Audio from Lip Movements Visual Input: A Survey

Publication Type : Conference Paper

Publisher : Springer

Source : Intelligent Systems, Technologies and Applications: Proceedings of Sixth ISTA 2021, India, Springer, DOI: https://doi.org/10.1007/978-981-16-0730-1_21, 2021.

Url : https://link.springer.com/chapter/10.1007/978-981-16-0730-1_21

Campus : Amritapuri

School : School of Computing

Center : Computer Vision and Robotics

Year : 2021

Abstract : Generating audio from a visual scene is an extremely challenging yet useful task, with applications in remote surveillance, speech comprehension for hearing-impaired people, and silent speech interfaces (SSI). Owing to recent advances in deep neural network techniques, there has been considerable research effort toward speech reconstruction from silent videos or visual speech. In this survey paper, we review several recent papers in this area and compare them in terms of their architectural models and the accuracy achieved.

Cite this Research Publication : Krishna Suresh, G Gopakumar, Subhasri Duttagupta, "Generating Audio from Lip Movements Visual Input: A Survey," Intelligent Systems, Technologies and Applications: Proceedings of Sixth ISTA 2021, India, Springer, DOI: https://doi.org/10.1007/978-981-16-0730-1_21, 2021.
