English-Assamese Neural Machine Translation using Prior Alignment and Pre-trained Language Model

Publication Type : Journal Article

Source : Computer Speech & Language Journal

Url : https://www.sciencedirect.com/science/article/abs/pii/S0885230823000438

Campus : Amaravati

School : School of Computing

Year : 2023

Abstract : In a multilingual country like India, automatic natural language translation plays a key role in connecting communities that speak different languages. Many researchers have explored and improved translation for high-resource languages such as English and German, achieving state-of-the-art results. However, the lack of adequate data is the main obstacle to automatic translation of low-resource north-eastern Indian languages such as Mizo, Khasi, and Assamese. Although several automatic translation systems for low-resource languages have appeared recently, their low evaluation scores indicate room for improvement. Neural machine translation has significantly improved translation quality in recent years, largely because of the availability of large amounts of data. Consequently, neural machine translation for low-resource languages remains underexplored, since adequate data is unavailable. In this work, we consider the low-resource English–Assamese pair and use transformer-based neural machine translation that leverages prior alignment and a pre-trained language model. To extract alignment information from source–target sentence pairs, we use an alignment technique based on pre-trained multilingual contextual embeddings. The transformer-based language model is built from monolingual target sentences. With both prior alignment and the pre-trained language model, the transformer-based neural machine translation model improves, and we achieve state-of-the-art results for both English-to-Assamese and Assamese-to-English translation.
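Illustrative note : The abstract does not specify how alignments are derived from the contextual embeddings. One common embedding-based approach, shown here only as a minimal sketch and not as the authors' method, links a source token and a target token when each is the other's nearest neighbour under cosine similarity. The function names and the toy three-dimensional vectors are hypothetical stand-ins for real multilingual contextual embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mutual_argmax_alignment(src_vecs, tgt_vecs):
    """Return (source_index, target_index) pairs that are mutual nearest neighbours.

    Assumption: src_vecs and tgt_vecs hold one embedding per token, produced
    by some multilingual encoder. This is a generic heuristic, not the
    procedure described in the paper.
    """
    if not src_vecs or not tgt_vecs:
        return []
    # Similarity matrix: rows are source tokens, columns are target tokens.
    sim = [[cosine(s, t) for t in tgt_vecs] for s in src_vecs]
    # Best target for each source token, and best source for each target token.
    fwd = [max(range(len(tgt_vecs)), key=lambda j: sim[i][j])
           for i in range(len(src_vecs))]
    bwd = [max(range(len(src_vecs)), key=lambda i: sim[i][j])
           for j in range(len(tgt_vecs))]
    # Keep only the links on which both directions agree.
    return [(i, j) for i, j in enumerate(fwd) if bwd[j] == i]


# Toy vectors (hypothetical): three source tokens and three target tokens
# whose nearest neighbours form a simple reordering.
src_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
tgt_vecs = [[0.0, 0.9, 0.1], [0.9, 0.1, 0.0], [0.1, 0.0, 0.9]]

alignment = mutual_argmax_alignment(src_vecs, tgt_vecs)
print(alignment)  # [(0, 1), (1, 0), (2, 2)]
```

In a real pipeline the vectors would come from a pre-trained multilingual encoder, and the resulting index pairs could be supplied to the translation model as prior alignment information. How that information is integrated is specific to the paper and is not reproduced here.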

Cite this Research Publication : Sahinur Rahman Laskar, Bishwaraj Paul, Pankaj Dadure, Riyanka Manna, Partha Pakray, and Sivaji Bandyopadhyay, English-Assamese Neural Machine Translation using Prior Alignment and Pre-trained Language Model, Computer Speech & Language Journal, ISSN 0885-2308. Impact Factor: 3.252. [2023].
