Translation Tools and their Caveats – Translate Science Blog

Translation tools (often referred to as CAT tools, for ‘computer-aided translation’) are a great means of streamlining some elements of the translation process, such as checking terminology or retrieving existing translations from so-called translation memories. Modern versions of these tools allow for web-based, collaborative translation, letting collaborators revise and comment on proposed translations, evaluate existing translations or add machine translation (henceforth MT) support. Modern MT systems are based on artificial neural networks, which have boosted quality considerably since roughly the mid-2010s.

CAT tools, with or without added machine translation support, have been studied from various angles. While they generally increase efficiency and often ease the task of translation, as translators do not have to start from scratch, there are some caveats to keep in mind when working with them. Here are some of the more important ones.

A major problem which has been described is lack of consistency. This is not limited to the terminological level, as shown, e.g., by Čulo & Nitzke (2016); a system may also suddenly change its output style, for instance by switching between different forms of addressing readers. A problem which is sometimes also attributed to how CAT tools display source and target text (mostly in sentence-sized segments, aligned left to right) is that translators do not necessarily spot these inconsistencies: a sort of peephole effect, as they check sentence by sentence and thus do not easily perceive the text as a whole in their revision. Sentence-by-sentence evaluation is also the reason why MT systems often used to score better in evaluations than they deserved, and sometimes still do (see, e.g., Castilho 2021; Krüger 2022): when only translations of single sentences are checked, inconsistencies are not spotted and thus not penalised.
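One way to counter the peephole effect is to check consistency across the whole text rather than segment by segment. The following is a minimal sketch of such a check, assuming the translation is available as a list of aligned (source, target) segments and that you maintain a small glossary of target variants to look for. The function name, the glossary and the sample segments are made up for illustration and are not part of any particular CAT tool.

```python
from collections import defaultdict


def find_inconsistent_terms(segments, glossary):
    """Report source terms that are rendered by more than one target variant.

    segments: list of (source_sentence, target_sentence) pairs.
    glossary: dict mapping a source term to the target variants to look for.
    Matching is plain case-insensitive substring search (an assumption made
    for brevity; it ignores inflection and word boundaries).
    """
    used = defaultdict(set)
    for index, (source, target) in enumerate(segments):
        source_lower, target_lower = source.lower(), target.lower()
        for term, variants in glossary.items():
            if term.lower() not in source_lower:
                continue
            for variant in variants:
                if variant.lower() in target_lower:
                    used[term].add((variant, index))

    report = {}
    for term, hits in used.items():
        variants_seen = {variant for variant, _ in hits}
        if len(variants_seen) > 1:
            # Keep (variant, segment index) pairs in document order.
            report[term] = sorted(hits, key=lambda hit: hit[1])
    return report


# Hypothetical sample data: the same source term is translated two ways.
segments = [
    ("Open the translation memory.", "Öffnen Sie den Übersetzungsspeicher."),
    ("The translation memory is empty.", "Das Translation Memory ist leer."),
    ("Save the file.", "Speichern Sie die Datei."),
]
glossary = {"translation memory": ["Übersetzungsspeicher", "Translation Memory"]}

report = find_inconsistent_terms(segments, glossary)
for term, hits in report.items():
    print(term, "->", hits)
```

Because the check runs over all segments at once, it flags the switch between the two renderings even though each sentence looks fine on its own. The same idea could be applied to forms of address by listing the competing pronouns as variants, though simple substring matching will produce false hits there.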

A second very serious problem, known from other fields of AI, is that neural MT systems reproduce biases that are implicitly or explicitly encoded in the training texts, a notable issue being gender bias. This shows when translating from a language that has little or no grammatical gender, such as English, into a language such as German, which differentiates between grammatical ‘masculine’, ‘feminine’ and ‘neuter’ gender (which often, but not necessarily, coincides with (supposed) biological sex for nouns referring to humans). Try translating “sexy pianist” and “clever pianist” into German with MT systems like DeepL. At the time of writing the first version of these notes, the former translated into “sexy Pianistin” (feminine gender), the latter into “geschickter Pianist” (masculine gender). Also, gendering across a text can be wildly inconsistent. And, highlighting the non-deterministic and adaptive nature of such systems, the results can vary not only between systems, but even for one system over time.

Third, watch out for missing or even spurious additional text. Koehn (2017) describes some of the challenges of early neural machine translation research, some of which have been addressed in the meantime, but an important one remains: MT hallucination, also called MT fiction. Neural MT systems basically operate by trying to predict the most likely next output based on previous input (which, in principle, is the same mechanism that allows for search completion in a web search bar). Take a moment to reflect on the options you are given in a search query completion: some of them may be very fitting, others quite nonsensical. Modern MT systems have become very good at picking out the fitting options, but when they cannot ‘make sense’ of the input, they may omit something, simply ‘guess’, or even add material that is not there in the source text.
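A crude first filter for omissions and additions is to compare the length of each source segment with that of its translation. The sketch below assumes aligned (source, target) pairs and uses word counts. The thresholds of 0.5 and 2.0 are arbitrary illustrative values, not established norms, and would need tuning per language pair. A flagged segment is only a candidate for closer reading, and an unflagged one may still contain invented or missing content.

```python
def flag_length_outliers(pairs, low=0.5, high=2.0):
    """Return (index, ratio) for segments with an unusual target/source word ratio.

    pairs: list of (source_sentence, target_sentence) tuples.
    low, high: assumed bounds for an acceptable ratio (illustrative only).
    """
    flagged = []
    for index, (source, target) in enumerate(pairs):
        source_words = len(source.split())
        target_words = len(target.split())
        if source_words == 0:
            continue  # nothing to compare against
        ratio = target_words / source_words
        if ratio < low or ratio > high:
            flagged.append((index, round(ratio, 2)))
    return flagged


# Hypothetical sample data: one plausible pair, one likely omission,
# one likely addition.
pairs = [
    ("The cat sleeps on the sofa.", "Die Katze schläft auf dem Sofa."),
    ("Please read the manual before you switch on the device.", "Bitte lesen."),
    ("Thank you.", "Vielen Dank, dass Sie sich für unser Produkt entschieden haben."),
]

flagged = flag_length_outliers(pairs)
print(flagged)
```

Word counts are a rough proxy: languages that form long compounds, such as German, naturally yield fewer words than English for the same content, which is one reason the bounds should be treated as assumptions to adjust.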

Last but not least, data ethics should be raised as an issue here. Note that for web-based CAT tools and/or machine translation systems (also those that you can plug into your locally installed CAT tool), the source text will be copied over to and processed by multiple other machines. Even if you have permission to produce a translation that is accessible under more liberal terms, this can technically be a violation of copyright for the source text if it falls under stricter copyright terms. Anonymization of people, which may not have been much of an issue for printed, narrowly distributed material, can also become a problem in such settings, even if you choose to anonymize the target text. Ecological matters may apply as well, giving rise to the question of how often and at which stage(s) MT should be used: it requires, after all, quite a bit of computing power. For a more in-depth discussion of ethics and the use of machine translation, see Moorkens (2022).
