Publication Type : Conference Proceedings
Publisher : IEEE
Url : https://ieeexplore.ieee.org/document/10028882/
Campus : Amaravati
School : School of Computing
Year : 2022
Abstract : Multimodal machine translation (MMT) extracts information from several modalities, on the presumption that the additional modalities provide useful alternative views of the input data. Despite its significant benefits, implementing an MMT system for many languages is challenging, mainly due to the scarcity of multimodal datasets. For the low-resource English-Mizo pair, no standard multimodal corpus is available. Therefore, in this paper, we develop the Mizo Visual Genome 1.0 (MVG 1.0) dataset for English-Mizo MMT, comprising images with corresponding bilingual textual descriptions. According to automated evaluation metrics, multimodal neural machine translation (MNMT) outperforms text-only neural machine translation. To the best of our knowledge, our English-Mizo MMT system is the first work in this direction, and it can therefore serve as a baseline for future research on MMT for the low-resource English-Mizo language pair.
Cite this Research Publication : Vanlalmuansangi Khenglawt, Sahinur Rahman Laskar, Riyanka Manna, Partha Pakray and Ajoy Kumar Khan, 'Mizo Visual Genome 1.0: A Dataset for English-Mizo Multimodal Neural Machine Translation', Track 5: Artificial Intelligence, Data Science & Computing, IEEE SILCON 2022, 4-6 November 2022, National Institute of Technology Silchar.