Generating Audio from Lip Movements Visual Input: A Survey

Publication Type : Conference Paper

Publisher : Springer

Source : Intelligent Systems, Technologies and Applications: Proceedings of Sixth ISTA 2021, India, Springer, DOI: https://doi.org/10.1007/978-981-16-0730-1_21, 2021.

Url : https://link.springer.com/chapter/10.1007/978-981-16-0730-1_21

Campus : Amritapuri

School : School of Computing

Center : Computer Vision and Robotics

Year : 2021

Abstract : Generating audio from a visual scene is an extremely challenging yet useful task, with applications in remote surveillance, speech comprehension for hearing-impaired people, and silent speech interfaces (SSI). Owing to recent advances in deep neural network techniques, there has been considerable research effort toward speech reconstruction from silent videos or visual speech. In this survey paper, we review several recent papers in this area and compare them in terms of their architectural models and the accuracy achieved.

Cite this Research Publication : Krishna Suresh, G Gopakumar, Subhasri Duttagupta, "Generating Audio from Lip Movements Visual Input: A Survey," Intelligent Systems, Technologies and Applications: Proceedings of Sixth ISTA 2021, India, Springer, DOI: https://doi.org/10.1007/978-981-16-0730-1_21, 2021.
