Adapting the Generic English-Croatian NMT Model to a Religious Domain

Authors

Marija Brkić Bakarić
Fakultet informatike i digitalnih tehnologija, Sveučilište u Rijeci
https://orcid.org/0000-0003-4079-4012
Lucia Načinović Prskalo
Fakultet informatike i digitalnih tehnologija, Sveučilište u Rijeci
https://orcid.org/0000-0002-8832-2527
Košuta Estera Lerga
Filozofski fakultet, Sveučilište u Rijeci

Abstract

Recent advances in artificial intelligence have significantly affected many professions, including the translation industry, and have notably changed how translators work. The study presented in this article indicates that any translator today, even one without advanced IT skills, can build a higher-quality Neural Machine Translation (NMT) system from their own texts. This paper evaluates Google’s AutoML Translation service, which lets users train high-quality custom models on their own text data. Specifically, AutoML Translation adds a layer that tailors the generic Translation API model to a specific domain. Training requires a user-supplied dataset of aligned sentence pairs in the source and target languages. In this study, the service was used to adapt the base English-Croatian Google NMT model to the religious domain. Following a brief introduction to machine translation, the paper outlines the key aspects of the training and evaluation processes and presents the two corpora used for training. The results show that the customized model outperforms the base model, as measured by the BLEU score.
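
The abstract compares the two models by BLEU score. As a rough illustration of what that metric measures, the following is a minimal sketch of corpus-level BLEU under simplifying assumptions that are not taken from the paper: a single reference per sentence, whitespace tokenization, n-grams up to length 4, and no smoothing. The function names and the example sentences are hypothetical; the sentences are invented toy strings, not output from any model or data from the study's corpora.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU on a 0-100 scale (one reference per hypothesis).

    Assumptions of this sketch: whitespace tokenization, uniform n-gram
    weights, no smoothing (any n-gram order with zero matches gives 0.0).
    """
    matches = [0] * max_n  # clipped n-gram matches, summed over the corpus
    totals = [0] * max_n   # hypothesis n-gram counts, summed over the corpus
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngrams(h, n), ngrams(r, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, r_counts[g]) for g, c in h_counts.items())
            totals[n - 1] += sum(h_counts.values())

    if min(matches) == 0:
        return 0.0

    # Geometric mean of the modified n-gram precisions.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize hypotheses shorter than the references.
    brevity = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity * math.exp(log_precision)


if __name__ == "__main__":
    # Invented toy strings, for illustration only.
    reference = ["Gospodin je pastir moj ni u čem ja ne oskudijevam"]
    candidate_a = ["Gospodin je moj pastir ja ne oskudijevam ni u čemu"]
    candidate_b = ["Gospodin je pastir moj ni u čem ne oskudijevam"]
    print("candidate A:", round(corpus_bleu(candidate_a, reference), 2))
    print("candidate B:", round(corpus_bleu(candidate_b, reference), 2))
```

A candidate that preserves longer word sequences from the reference scores higher than one that reorders them. Scores from a sketch like this are not comparable with published figures, which depend on the tokenization and smoothing of the tool used.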

Published

09.01.2025.