Depressed plastic robot

Translation tools (often referred to as CAT tools, for ‘computer-aided translation’) are a great means of streamlining some elements of the translation process, such as checking terminology or retrieving existing translations from so-called translation memories. Modern versions of these tools allow for web-based, collaborative translation, giving collaborators such possibilities as revising and/or commenting on proposed translations, evaluating existing translations or adding machine translation (henceforth MT) support. Modern MT systems are based on artificial neural networks, which have boosted quality considerably since roughly the mid-2010s.

CAT tools, with or without added machine translation support, have been studied from various angles. While they in general increase efficiency and often ease the task of translation as translators do not have to start from scratch, there are some caveats to be kept in mind when working with them. Here are some of the more important ones.

A major problem which has been described is lack of consistency. This does not only extend to the terminological level, as shown, e.g., by Čulo & Nitzke (2016); a system may also suddenly change its output style, switching between different forms of addressing readers, for instance. A problem which is sometimes also attributed to how CAT tools display source and target text (mostly in segments of sentences, aligned left-to-right) is that translators do not necessarily spot these inconsistencies. This is a sort of peephole effect: as they check sentence by sentence, they do not easily perceive the text as a whole in their revision. Sentence-by-sentence evaluation is also the reason why MT systems often used to score better in their evaluation than they deserved and sometimes still do (see, e.g., Castilho 2021; Krüger 2022): when only translations of single sentences are checked, inconsistencies across sentences are not spotted and thus not penalised.

A second very serious problem, as known from other fields of AI, is that neural MT systems reproduce biases that are implicitly or explicitly encoded in the training texts, a notable issue being gender bias. When translating from a language that has little or no grammatical gender such as English into a language such as German which differentiates between a grammatical ‘masculine’, ‘feminine’ and ‘neuter’ gender (which often, but not necessarily coincide with (supposed) biological sex for nouns referring to humans), this shows: Try translating “beautiful pianist” and “clever pianist” into German with MT systems like DeepL. At the time of writing the first version of these notes, the former translates into “schöne Pianistin” (feminine gender), the latter into “geschickter Pianist” (masculine gender). Also, gendering across a text can be wildly inconsistent. And highlighting the non-deterministic and adaptive nature of such systems, the results can actually vary not only between systems, but even for one system over time.

Third, watch out for missing or even spurious additional text. Koehn & Knowles (2017) describe some of the challenges of early neural machine translation research, some of which have been addressed in the meantime, but an important one remains: MT hallucination, also called MT fiction. Neural MT systems basically operate by trying to predict the next most likely output based on previous input (which, in principle, is the same mechanism that allows for search completion in a web search bar). Take a moment to reflect on the options you are given in a search query completion: some of them may be very fitting, others quite nonsensical. Modern MT systems have become very good at picking out the fitting options, but when they cannot ‘make sense’ of the input, they may omit something, just try to ‘guess’ or even add material that is not there in the source text.
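To make the ‘predict the next most likely output’ idea more tangible, here is a minimal sketch in Python. It is not how a neural MT system is built: real systems use neural networks and also take the source sentence into account, whereas this toy only counts which word follows which in a tiny, made-up corpus. The function names and the corpus are assumptions for illustration only.

```python
from collections import Counter, defaultdict

def train_bigrams(sentences):
    """Count, for every word, which words follow it in the toy corpus."""
    follow = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            follow[current][nxt] += 1
    return follow

def complete(follow, start, max_words=5):
    """Greedily append the most frequent next word, up to max_words times."""
    out = [start]
    for _ in range(max_words):
        candidates = follow.get(out[-1])
        if not candidates:
            break  # nothing ever followed this word in the corpus
        out.append(candidates.most_common(1)[0][0])
    return " ".join(out)

# Hypothetical three-sentence 'training corpus'.
corpus = [
    "the translator checks the terminology",
    "the translator checks the text",
    "the translator checks the terminology again",
]
model = train_bigrams(corpus)

print(complete(model, "the", 2))         # the translator checks
print(complete(model, "translator", 3))  # translator checks the translator
```

The second completion reads fluently but says something that appears in none of the three sentences: always picking the locally most likely continuation can yield plausible-sounding text without any guarantee that it reflects the input.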

Last but not least, data ethics should be raised as an issue here. Note that for web-based CAT tools and/or machine translation systems (also those that you can plug into your locally installed CAT tool), the source text will be copied over to and processed by multiple other machines. Even if you have permission to produce a translation that is accessible under more liberal terms, this can technically be a violation of copyright for the source text if it falls under stricter copyright terms. Anonymization of people, which may not have been much of an issue for printed, narrowly distributed material, can also pose a problem in such settings, even if you choose to anonymize the target text. Ecological matters may apply as well, giving rise to the question of how often and at which stage(s) MT should be used: it requires, after all, quite a bit of computing power. For a more in-depth discussion of ethics and the use of machine translation, see Moorkens (2022).

This is the second post to help people make and publish translations of scientific works. The first post gave some “theoretical background to translation”.

Literature

Castilho, Sheila. 2021. ‘Towards document-level human MT evaluation: On the issues of annotator agreement, effort and misevaluation’. In Proceedings of the Workshop on Human Evaluation of NLP Systems (HumEval), 34–45. Online: Association for Computational Linguistics. https://aclanthology.org/2021.humeval-1.4.

Čulo, Oliver, and Jean Nitzke. 2016. ‘Patterns of Terminological Variation in Post-Editing and of Cognate Use in Machine Translation in Contrast to Human Translation’. Baltic Journal of Modern Computing 4 (2): 106–14. https://aclanthology.org/W16-3401.pdf

Koehn, Philipp, and Rebecca Knowles. 2017. ‘Six Challenges for Neural Machine Translation’. In Proceedings of the First Workshop on Neural Machine Translation, 28–39. Vancouver: Association for Computational Linguistics. https://doi.org/10.18653/v1/W17-3204.

Krüger, Ralph. 2022. ‘Some Translation Studies Informed Suggestions for Further Balancing Methodologies for Machine Translation Quality Evaluation’. Translation Spaces, March. https://doi.org/10.1075/ts.21026.kru.

Moorkens, Joss. 2022. ‘Ethics and Machine Translation’. In Machine Translation for Everyone: Empowering Users in the Age of Artificial Intelligence, edited by Dorothy Kenny, 121–40. Translation and Multilingual Natural Language Processing 18. Language Science Press. https://zenodo.org/record/6653406.

Top photo of robot by Hello Robotics. This file is licensed under the Creative Commons Attribution-Share Alike 4.0 International license.

Photo of a backlit keyboard with a person typing

This is the first of two blog posts with general notes on how to approach the task of translating science, touching upon the most prevalent basic notions and advice relevant to the task. The two main points presented in this and the upcoming post are (a) an introduction to a present-day functionalist view of translation, which provides for a wide range of purpose-driven strategic translation options, and (b) key caveats when making use of digital support tools for translation, including machine translation. These general notes are meant for people who read academic texts at a postgraduate level and have experience with scholarly publishing, but may have little to no experience in translation. As technological tools are nowadays omnipresent in translation processes, they have been included here under basic background to translation.

Translation

Translation is a cluster concept (Tymoczko 2005) that is constituted by various cultural practices with complex overlapping similarities. This includes what is sometimes referred to as ‘translation proper’, i.e. ‘transferring’ a (mostly) written source text from one language to a target text in another language. Interpreting, i.e. the ‘transfer’ of (mostly) spoken language is part of the cluster concept, just as well as localization – of software, video games and the like – or sur-/subtitling, transcreating etc. In the following, the terms translation and translate shall include all these practices.

On a side note: It is exactly this understanding of translation as a cluster of cultural practices which opens up the possibility of studying not just the linguistic differences between two texts, but the whole range of patterns of practices and power concerning translation, including, but not limited to, such questions as what is translated and who commissions translations, what conscious and subconscious translation strategies are being taught and applied, how censorship and translation interact, etc. This wiki page introduces key concepts and issues that inform the pragmatics of translating a specific text.

Functionalism and translation strategies

Functionalist theories of translation (see, e.g., Vermeer 1989) have highlighted that translation is a purposeful activity, i.e. it is text production with a goal and an audience, with a precursor, the source text, which may require different levels of adaptation to the target culture. Nord (1989) introduces the spectrum between documentary and instrumental translation: The former is meant to highlight the original make-up of the source text, with interlinear glosses being an extreme form; the latter aims at producing a text which is meant to act as a target culture text and should not be distinguishable from original texts. All in all, a functionalist approach to translation offers us a wide array of translation strategies, keeping in mind that, following Nord, we should remain loyal to both the creators of the source text and the intended audience of the target text.

Applied to the purpose of translating science, you might ask yourself, for instance, how to go about the subtitling of a video which introduces a scientific topic. While, of course, you will want to get the terminology and the science right, do think about what the idea of the source text is: Is it purely informatory or does the video at hand also aim to entertain? Assuming it does, what is your goal in translation: Do you mostly care about the science or do you want to entertain as well? What you probably will not want is a ‘close’ translation in a structural sense, i.e. trying to mimic the syntactic or lexical structures of the source text – unless you are aiming, e.g., at documenting which linguistic strategies can be used in a certain language for edutainment videos. Another example is that of the translation of Bron Taylor’s book “Dark green religion” into German, where the author explicitly encouraged the translator Kocku von Stuckrad to add comments explaining how historical US-related circumstances compare to those in Germany, making the text more accessible to a German audience (von Stuckrad in Taylor 2020: 303).

Technical translation and cultural influences

A common misconception is that terminology (or language on the whole) in the natural and engineering sciences is near-‘objective’ in the sense that it fosters a ‘simple’ one-to-one transfer between languages. However, cultural influences abound also in technical language, influencing terminology, phraseology, style, text structures, argumentation patterns etc. Cultural influence here does not solely refer to the larger setting of regional, national, areal or global cultures, but also to cultures of specific scientific fields and subfields (i.e., shared assumptions, traditions, practices, etc.). Even within a language, creating, e.g., something like a common terminology may be quite an undertaking, especially in younger fields of research (see, e.g., Avizienis et al. 2004 for the field of dependable and fault-tolerant computing). Between languages, even the slightest differences in conceptualizations and uses can pose a challenge. On top of this, the influence of larger cultural contexts is omnipresent not just in the humanities or social sciences, with the discussion about the English master/slave terminology in computing and electrical engineering as a very prominent and illustrative example (Charboneau 2020). As pointed out above, these differences may extend to other linguistic levels such as phraseology, argumentation patterns or text structures, in some cases giving rise to strategies of translation which are often subsumed under adaptation, i.e. making deep(er) changes to the make-up of a (stretch of) text in order to make it more adequate to the target culture and fitting to the purpose, which can be quite in line with Nord’s principle of loyalty. Whichever strategy you choose, be aware of these cultural factors even in technical language.

Translating science

Who can translate science?

More often than not, translation is very likely co-creation. Professional translators will have learned how to adapt quickly to the terminology, phraseology and style of a field, how to invent new terminology, how to research effectively in cases of doubt and how to deal with faulty, ambiguous or badly formulated (stretches of) source texts, which is actually one of the most challenging and frequent problems in translation. Technical expertise is nevertheless often required for translation, inside as much as outside of translating science. Many works are translated by (groups of) people with domain knowledge and the necessary linguistic competences, and it is not unusual to find MA or PhD translation students, as well as career changers from fields far removed from linguistics, among them, provided they have a certain background in their languages and cultures of interest.

This provides us with a number of options when it comes to the question of who could translate science: It could be scientists, alone or in groups with complementary domain or language skills; some institutions might even have translation services that can spare at least some time to translate science or to assist in translating it; or some stakeholders might have money on the side to commission translation. In all cases, however, the domain knowledge of scientists will be crucial, and should you be in the position to commission a translation, be prepared to answer questions on linguistic and other aspects of the field in question.

Aiding (commissioned) translation / translators

In any case, as a scientist commissioning a translation, be prepared to act as the domain expert. You can aid the linguistic side of a (commissioned) translation if you put some sort of terminology (e.g. any dictionary for your field that you have at hand) at the disposal of those who translate, or if you have a collection of texts (ideally in all languages involved) which you can make available so that term candidates and collocation patterns can be extracted quickly by means of the appropriate tools (see, e.g., on this wiki; professional translators should have acquired access to such tools). If you have commissioned a translation, the use of tools which allow for collaborative work can be a great help, e.g. in order to quickly answer questions translators might have.

Literature

Avizienis, Algirdas, J-C Laprie, Brian Randell, and Carl Landwehr. 2004. ‘Basic Concepts and Taxonomy of Dependable and Secure Computing’. IEEE Transactions on Dependable and Secure Computing 1 (1): 11–33.

Charboneau, Tyler. 2020. ‘How “Master” and “Slave” Terminology Is Being Reexamined in Electrical Engineering – News’. Accessed 1 August 2022. https://www.allaboutcircuits.com/news/how-master-slave-terminology-reexamined-in-electrical-engineering/

Nord, Christiane. 1989. ‘Loyalität Statt Treue. Vorschläge zu einer funktionalen Übersetzungstypologie’. Lebende Sprachen, no. 3: 100–105.

Taylor, Bron. 2020. Dunkelgrüne Religion: Naturspiritualität und die Zukunft des Planeten. Translated by Kocku von Stuckrad. Leiden, Netherlands: Brill, Wilhelm Fink.

Tymoczko, Maria. 2005. ‘Trajectories of Research in Translation Studies’. Meta 50 (4): 1082–97.

Vermeer, Hans J. 1989. ‘Skopos and commission in translational action’. In Readings in Translation Theory, edited by Andrew Chesterman, 173–87. Helsinki: Oy Finn Lectura AB.

Top photo of keyboard made by Colin / Wikimedia Commons / CC BY-SA 4.0

Vasconcelos public library in Mexico City

Artikel ini juga tersedia dalam bahasa Indonesia. (This article is also available in Indonesian.) This article is also available in English. Cet article est aussi disponible en français. (This article is also available in French.)

Translating scholarly works can make an important contribution to scientific communities. Albert Einstein, for example, translated several articles into English so that Anglo-American colleagues could contribute to the state of the art in science. The modern tendency to ignore scholarly works that are not written in English leads to poorer studies and duplicated work. Translation can help overcome language barriers and is therefore an important means of improving accessibility and participation and of counteracting the fragmentation of the literature into linguistic islands.

Translation also opens science to society, to the seven billion people who do not speak English and to many more who do not have English as their first language. Among them are many people who contribute directly to science by collecting data, people who point out problems that need to be researched, and people who use the fruits of science (teachers/trainers, architects, engineers, doctors, activists, policy makers, journalists, etc.) and thereby deliver the societal benefit of science.

Potential translators

Such translations can be produced by professional translators and published in a dedicated journal, by translators employed at the authors’ universities or institutes, by the authors themselves, or by volunteers who publish them in a repository or journal. We have found thousands of translated works published in journals in the CrossRef DOI database. A search for translations in repositories suggests that there are thousands more there, most likely produced by scholars for their community. With this post we would like to encourage scholars to translate and publish their own works and those of their colleagues, and to link the translation to the original document.

One of us (Victor) is a climate researcher working on the quality of climate station data, and it would be useful if some of his articles were translated into the languages of education around the world so that weather observers become more aware of quality issues. However, Victor does not speak the non-English languages of the World Meteorological Organization (Arabic, Chinese, French, Russian and Spanish). Colleagues who have expertise in the field and command these languages would have the necessary core skills to translate them.

On the other hand, there are cases in which geoscientists working in a particular region speak both the language of that region and languages of wider diffusion such as English. Since geoscience information can be very local, e.g. information on geohazards in remote areas, it would be of great benefit to the local community to understand the nature of the geohazards in their area. To make their work accessible to a broader audience, scholars could translate it into a language they command. An even wider reach could be achieved if the translation took the form of a non-paper product, e.g. a short video or podcast. In addition, journalists and bloggers would be more likely to write about these topics if a version of this non-paper product were also available in the regional language.

Academics and authors

To produce a translation, machine translation can quickly provide a good first draft for numerous language pairs. Publishing translations makes the work more worthwhile, as more people can find it more easily. (We are working on a translation switchboard service to make finding translations even easier.) The translations can be published in a manuscript repository; almost all repositories support this. Even better would be to publish the translations in journals. This would make them part of the academic credit system, where they could be classified under the category “community outreach”. The translations would be easier to find, and a peer reviewer checking the translation would increase trust in its quality. Even just writing a translation of the abstract is of great value, as it makes the article easier to find. If the article is published in a repository, one can add multiple abstracts. An abstract is also suitable for a short blog post.

Journals

In the “publication frenzy” of the current research evaluation system, it can pay off for (national) journals to publish translations. The Journal Impact Factor (JIF) is calculated as the citations to issues published in the two preceding years divided by the number of “research articles” published in that two-year period. This is one reason why journals publish more and more editorials that make you wonder why someone wasted their precious time on Earth writing them. Editorials do get cited, but do not count as “research articles”, and thus raise the JIF. What counts as a “research item” is always a matter of “negotiation” between the bibliographic database and the journal (sending chocolate can help). But one can argue that a translation is not original research and should not be counted as a “research item”, while its citations do count for the journal.

Many non-English-language journals already have English abstracts. We would explicitly like to encourage this.

Databases

The makers of bibliographic indices and databases such as the Web of Science, as well as of university rankings, could promote the production of translations by including the citations of the translation in the citation count of the original. If the translation appears in an indexed journal, this would not increase a researcher’s total number of citations, but it could raise their h-index.

Publication rules

In the age of the “publish-or-perish” principle, authors sometimes try to sell a translation as an original study and do not clearly mark the document as a translation. This has earned translations a bad reputation in some quarters, which should be avoided. The Committee on Publication Ethics (COPE) advises scholarly journals and points out in its guidance on redundant publication that translations should not be regarded as duplicates, which would be an ethical problem, but as valid publications. COPE notes that such secondary or derivative publications should be marked as translations by referencing the original version. COPE writes:

ICMJE [International Committee of Medical Journal Editors] advises that translations are acceptable but MUST reference the original.

Even if the translated article cites the original, this is not apparent in a publication list. It is therefore good practice to mention in the title itself that the work is a translation. The ICMJE recommends:

The title of the secondary publication should indicate that it is a secondary publication (complete or abridged republication or translation) of a primary publication.

Spread the word

Whether you are an author, editor, reviewer or employed scientist, please regard translations as research output and open up science by producing more translations to speed up progress in knowledge dissemination.

* The photo at the top shows the Biblioteca Vasconcelos, the Megabiblioteca (“megalibrary”) in the city centre of Mexico City.

Vasconcelos public library in Mexico City

This article is also available in Indonesian. Artikel ini juga tersedia dalam bahasa Indonesia.

This article is also available in French. Cet article est aussi disponible en français.

Translating scholarly works can contribute enormously to a scientific community. Famously, Albert Einstein translated articles into English so that Anglo-Americans could contribute to state-of-the-art science. The modern tendency to ignore scholarship that is not in English leads to lower quality studies and double work. Translation can help overcome linguistic barriers, and is thus an important means to increase accessibility and participation as well as to counteract fragmentation of the literature into linguistic islands.

Translation also opens science to society, to the seven billion people who do not speak English and many more who do not use it as their first language. Among them are many people who contribute directly to science by collecting data, people who point to problems that need research, or people who need to use the fruits of science (teachers/trainers, architects, engineers, doctors, activists, policy makers, journalists, etc.) and amplify the societal benefit of science.

Potential translators

This translation can be done by professional translators and published in dedicated journals, by translators employed by universities or institutes of the authors, by the authors themselves, or by volunteers and published in a repository or journal. We have found thousands of translated works published in journals in the CrossRef DOI database. A search for translations on repositories suggests that there are thousands more there, most likely produced by scholars for their community. With this post we would like to encourage scholars to translate and publish their own works and works of colleagues, and to connect the translation to the original document.

One of us (Victor) is a climatologist working on the quality of climate station data and it would be useful for some of his articles to be translated into the languages of higher education all over the world so that weather observers are more aware of quality issues. However, Victor does not speak or is not fluent in the non-English World Meteorological Organization languages (Arabic, Chinese, French, Russian and Spanish). Colleagues who have the expertise in the field and do master these languages would have the necessary core skills to translate them.

On the other hand, there are cases like geoscientists working on a specific region who will speak the language of that region as well as languages with higher diffusion such as English. As geoscience information can be very local, e.g. geohazard information in remote areas, it would be highly beneficial to the local community to understand the nature of the geohazards in their area. In order to make their work available to a broader audience, they could translate their work into a language with higher diffusion they command. Even more outreach could be gained by crafting the translation as a non-paper product, like a short video or podcast. It would, on top of this, increase the chance of journalists and bloggers writing about these topics if a version of this non-paper product were also available in the local language.

Scholars and authors

If you aim to produce a translation, you could get a jump start by using machine translation, which can offer a decent first draft for quite a few language pairs. Publishing translations makes the work more worthwhile as more people will more easily find it. (We are working on a Translations Switchboard to make finding translations even easier.) The translations can be published on a manuscript repository; nearly all repositories support this. Even better would be the publication of translations in journals. This would make them part of the academic credit system, where they can be classified in the community outreach category, and would make it easier to find them; a peer reviewer revising the translation would also increase trust in the translation quality. Even just writing a translation of the abstract already adds a lot of value as it makes the article more findable. If the article is published in a repository, one can add multiple abstracts. An abstract would also make a nice short blog post.

Journals

In the publish-or-perish madness of the current research evaluation system it may be profitable for (national) journals that do not yet do so to publish translations. The Journal Impact Factor (JIF) is calculated as the citations of issues published in the two previous years divided by the number of ‘research items’ published in that two-year period. This is one reason why journals publish increasingly many editorials where you wonder why someone wasted their precious time on Earth to write them. Editorials are cited, but do not count as ‘research items’, and thus raise the JIF. It is always up to a ‘negotiation’ between the bibliographic database and the journal what counts as a ‘research item’ (sending some chocolate may help), but you could make the case that a translation is not original research and should not be counted as a ‘research item’, while its citations do count for the journal.
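The arithmetic behind this argument can be sketched in a few lines of Python. This is a simplified reading of the description above, not the official calculation of any database, and all numbers are invented for illustration.

```python
def journal_impact_factor(citations, research_items):
    """Citations to the two previous years divided by counted 'research items'."""
    if research_items <= 0:
        raise ValueError("research_items must be positive")
    return citations / research_items

# Hypothetical journal: 100 counted research items attracting 200 citations.
baseline = journal_impact_factor(citations=200, research_items=100)

# The same journal also publishes translations that attract 30 citations.
# If translations are not counted as 'research items', only the numerator grows.
with_translations = journal_impact_factor(citations=200 + 30, research_items=100)

print(baseline)           # 2.0
print(with_translations)  # 2.3
```

Under these assumptions, the citations to translations raise the ratio because the denominator stays the same, which is the same mechanism that makes editorials attractive.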

Many journals in non-English languages already have English abstracts. This should be encouraged. 

Databases

Makers of bibliographic indices and databases, such as the Web of Science, as well as university rankings, could stimulate the production of translations by including the citations of the translation in the citation count of the original. If the translation is in an indexed journal, this would not increase the citation total for a researcher, but it could increase their h-index.
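A small worked example may help to see why the h-index can rise while the citation total stays the same. The sketch below uses the usual definition of the h-index (the largest h such that h publications have at least h citations each); the citation counts are made up.

```python
def h_index(citation_counts):
    """Largest h such that h publications have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

# Hypothetical researcher: four originals, plus a translation of the last
# original that is indexed separately and has 1 citation of its own.
originals = [4, 4, 4, 3]
translation_citations = 1

counted_separately = originals + [translation_citations]
merged_into_original = originals[:-1] + [originals[-1] + translation_citations]

print(sum(counted_separately), h_index(counted_separately))      # 16 3
print(sum(merged_into_original), h_index(merged_into_original))  # 16 4
```

In this invented case the total is 16 either way, but crediting the translation's citation to the original lifts a fourth publication over the threshold.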

Publication ethics

In this age of publish-or-perish, sometimes authors try to sell a translation as an original study and do not clearly mark the document as a translation. This has given translations a bad reputation in some quarters, which should be avoided. The Committee on Publication Ethics (COPE) advises scholarly journals and mentions in its guidance on redundant publication that translations should not be seen as duplicates, which would be an ethics problem, but as valid publications. COPE notes that such secondary or derivative publications should be marked as translations by referencing the original version. COPE writes:

ICMJE [International Committee of Medical Journal Editors] advises that translations are acceptable but MUST reference the original.

Even if the translated article cites the original, this is not apparent in a publication list. It is thus good practice to mention in the title itself that the work is a translation. The ICMJE recommends: 

The title of the secondary publication should indicate that it is a secondary publication (complete or abridged republication or translation) of a primary publication.

Spread the word

So whether you are author, editor, peer reviewer or hiring scientists, please see translations as research output and open science by producing more translations to speed up progress in knowledge dissemination. 

* The photo at the top is the Biblioteca Vasconcelos, the Megabiblioteca (“megalibrary”) in the downtown area of Mexico City.

Screenshot of a mock-up of the Translation Science Switchboard.

You know an article exists, but cannot read its language. So you go to our tool, paste the Digital Object Identifier of the article and get a list of translated versions. You manage your articles in a reference manager and notice that an article on your reading list is now also available in your mother tongue. You are really enthusiastic about a new article that was just published, which has policy implications for your country, and you want to translate it so that more people can read it. On our tool you find a partial translation made by a colleague from another university department; you jointly finish the translation, publish it on a repository and upload the link to our database.

These scenarios demonstrate that a translation finding tool would be really useful and could also stimulate the production of translations.

One of us started dreaming of such a tool while attending a climate conference in Peru. Colleagues from the local weather service were doing interesting work, but many did not speak much English. An important way they kept up to date was through the guidance reports written by the World Meteorological Organization (WMO), one of the oldest open science organizations. The WMO translates all its guidance reports into many languages because the weather services that control the WMO see this as a priority. A colleague at the conference told me that she sometimes translates important English articles into Spanish and emails them to her colleagues, just as Albert Einstein translated important studies into English. That made me wonder whether we could spread translations in a better way and thus also stimulate their production.

Lingua Franca

Translations are part of the open science movement. Translated scientific articles make science more accessible to regular people, science enthusiasts, activists, advisors, trainers, consultants, architects, doctors, journalists, planners, administrators, technicians and many scientists.

English as a common language has mostly made global communication within science easier. However, this has made communication with non-English communities harder. This goes both ways: it affects people who could benefit from scientific knowledge and people who have knowledge scientists should know.

For English speakers it is easy to overestimate how many people speak English because we mostly deal with foreigners who do speak English. However, it is thought that only about one billion people speak English. That means that seven billion people do not.

Translated scientific articles speed up scientific progress by tapping into more knowledge and avoiding double work. They thus improve the quality of science. The additional two-way knowledge transfer aids innovation and tackling the big global challenges in the fields of climate change, agriculture and health. Translations can improve public disclosure, scientific engagement and science literacy.

Phone screenshot of the Translate Science Switchboard.

Open Source tool

We want to develop and deploy an open source tool to make it easier to find translations and thus make them more worthwhile to make. In its simplest form people should be able to search using a Digital Object Identifier, a title or the names of the authors and be presented with a list with links to translations. People or organizations who made or have translations should be able to upload lists with links. Users and volunteers should be given moderation tools.

Also searching by topic and a topic directory would be useful as translated articles tend to be the more important ones in a field. The database should also be accessible via an Application Programming Interface (API) so that other tools and webpages can automatically display information on any translations and inform us about new translations.
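As a rough illustration of the simplest form of such a lookup, the sketch below keeps translations in memory, keyed by the DOI of the original, and returns them as JSON the way an API endpoint might. All class and function names, the DOIs and the URLs are hypothetical; this is not the design of the actual tool.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Translation:
    language: str  # language code of the translation, e.g. "es"
    url: str       # where the translation can be read


def normalise_doi(raw: str) -> str:
    """DOIs are case-insensitive and are often pasted with a resolver prefix."""
    doi = raw.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


class TranslationIndex:
    """In-memory stand-in for the database of known translations."""

    def __init__(self):
        self._by_doi = {}

    def add(self, original_doi: str, translation: Translation) -> None:
        self._by_doi.setdefault(normalise_doi(original_doi), []).append(translation)

    def find(self, doi: str) -> list:
        return list(self._by_doi.get(normalise_doi(doi), []))

    def to_json(self, doi: str) -> str:
        """What a hypothetical API endpoint could return for one DOI."""
        return json.dumps([asdict(t) for t in self.find(doi)])


index = TranslationIndex()
index.add("10.1234/example.5678",
          Translation("es", "https://repository.example.org/es/5678"))
index.add("10.1234/EXAMPLE.5678",
          Translation("id", "https://repository.example.org/id/5678"))

print(index.to_json("https://doi.org/10.1234/example.5678"))
```

Normalising the DOI before storing and searching means that a pasted resolver link and a bare DOI find the same record. Searching by title or author would need a real database with text search, which this sketch leaves out.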

There were similar databases during the Cold War to keep up with Soviet research and we want to try to rescue their datasets and upload them to our database. Many research libraries, international organizations and research institutes (World Meteorological Organization, UK Met Office, …) have translated articles and reports, which should be included. The same holds for translations of articles written before English became the lingua franca.

The expensive organizations maintaining these databases and making translations collapsed after the Cold War. In the internet age, we can maintain large knowledge bases cost-effectively with global volunteers, as Wikipedia has demonstrated, and include many more languages. Translating has also become much easier, as a reasonable first draft can often be provided by machine learning. And we can now network people who only occasionally make translations (of their own articles).

Not every contribution will be perfect. Users and editors of such a database should be able to indicate how good a translation is and need moderation tools. With versioning it should be easy to revert vandalism or spamming. We could green-list known scientific repositories and red-list known spammers.
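One way to picture the green and red lists is as a first filter on the host name of a submitted link. The following is only a sketch: the domains are placeholders, and the three outcomes ("accept", "reject", "review") are an assumption about how moderation could be organised.

```python
from urllib.parse import urlparse

# Placeholder entries; a real deployment would maintain these lists itself.
GREEN_LIST = {"repository.example.org"}  # trusted scientific repositories
RED_LIST = {"spam.example.com"}          # known spammers


def _matches(host: str, domains: set) -> bool:
    """True if host is one of the domains or a subdomain of one."""
    return any(host == d or host.endswith("." + d) for d in domains)


def classify_link(url: str) -> str:
    """Return 'reject', 'accept', or 'review' (needs a human editor)."""
    host = (urlparse(url).hostname or "").lower()
    if _matches(host, RED_LIST):
        return "reject"
    if _matches(host, GREEN_LIST):
        return "accept"
    return "review"


for link in ("https://repository.example.org/record/1",
             "http://ads.spam.example.com/buy-now",
             "https://unknown.example.net/paper.pdf"):
    print(classify_link(link), link)
```

Checking the red list first means a spam host can never slip through because of an overly broad green-list entry. Anything on neither list is left to the editors.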

If there are multiple translations for a language, editors or users should be able to rank them and indicate which one is best, if only because external systems using our information may be designed to accept only one translation per language, as that will be the most typical case.
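Picking one translation per language for such external systems could look roughly like this. The `score` field is a hypothetical stand-in for whatever ranking editors and users end up producing, and ties are resolved by keeping the entry submitted first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RankedTranslation:
    language: str
    url: str
    score: int  # hypothetical ranking, e.g. net votes by editors


def best_per_language(translations):
    """Map each language to its highest-scoring translation."""
    best = {}
    for t in translations:
        current = best.get(t.language)
        # Strict '>' keeps the earlier entry when scores are tied.
        if current is None or t.score > current.score:
            best[t.language] = t
    return best


candidates = [
    RankedTranslation("es", "https://repository.example.org/es/a", 3),
    RankedTranslation("es", "https://repository.example.org/es/b", 7),
    RankedTranslation("id", "https://repository.example.org/id/a", 1),
]

for language, t in best_per_language(candidates).items():
    print(language, t.url)
```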

A “talk page”, similar to Wikipedia’s, could be useful to allow users to point to problems, discuss which translations are best and which quality flags need to be set. Possibly even to organize to jointly make a better translation. This could be implemented with a commenting or forum system in a background tab.

Copying Wikipedia's idea of a page with recent changes can help with quality control. Such a page can be filtered in several ways, e.g., for contributions by new people. In case someone finds that a participant made a problematic contribution, a look at their user page may reveal more problems.

Many more technical details of how such a system could look can be found on our Wiki.

Mockup of the search page showing all articles on the search term for which the database has translations.

Points to ponder

How hard would it be to make the system distributed, with multiple servers that talk to each other and exchange data if they trust each other? We are doing this for science, but there are groups outside of science who could use a similar system; the nearest to us would be education and science communication. (Disciplinary) groups within science may be able to use their networks to promote the production of translations. Bulk download of our data would then be a good way to get a new server started.

It could be worthwhile to make a (private) backup of the known translations and regularly check for broken links. The backup can help the editors find the new location of the translation or to upload it elsewhere if the license allows for this.
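A periodic check for broken links could be as simple as the sketch below. It uses only Python's standard library and sends a HEAD request per link; the function names are our own, and a real checker would also need retries and rate limiting, since some servers reject HEAD requests or are temporarily down.

```python
import urllib.request


def link_is_alive(url: str, timeout: float = 10.0) -> bool:
    """True if the URL answers a HEAD request with a 2xx or 3xx status."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (OSError, ValueError):
        # OSError covers HTTP errors, DNS failures and timeouts;
        # ValueError covers strings that are not valid URLs.
        return False


def find_broken_links(urls, is_alive=link_is_alive):
    """Return the URLs that fail the check, in their original order."""
    return [url for url in urls if not is_alive(url)]


# Offline demonstration with a stand-in checker instead of real requests.
known_dead = {"https://repository.example.org/moved"}
broken = find_broken_links(
    ["https://repository.example.org/ok", "https://repository.example.org/moved"],
    is_alive=lambda url: url not in known_dead,
)
print(broken)
```

Passing the checker in as a parameter keeps the function testable without network access. The broken links it reports are the ones for which editors would turn to the backup.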

It may be a good idea to have multiple types of links to translations. Literal translations, but also related works in another language, for example a PhD thesis in language X and a corresponding article in language Y. Sometimes people may write a summary of an article in another language, which could be valuable if there is no full translation. Also links to partial translations can still be valuable and showing them could promote their completion.

The road ahead

The above mainly describes the technical aspects of such a Translations Switchboard, but there is also a human aspect. We will need a community of editors for every language to check submitted URLs to avoid spam and select the best version in case multiple ones are available. So we need tools to build and organize this community. We will also need publicity so that people know about the service. Part of the advertising could function via integration of our system in others. We will need volunteers who contact possible sources of translations that could be integrated into the database and who promote the production of translations in their circles.

Designing and coding the full system described above would be a considerable task. If someone has experience with similar projects and would like to apply for funding: feel free to make the idea your own; we are also happy to be the science advisory team. For now we decided to start small: create a minimal system and add the data we know of to it. That way the idea becomes more concrete, which will hopefully help to find resources to build it and to fill the database. This first version will be coded using PHP, HTML, CSS and MariaDB.

You can already help us a lot by spreading this idea to increase the chance that people interested in contributing learn about it. Feedback on the idea in the comments below is also very valuable. If any of the above appeals to you, please get in touch on Mastodon or by email.

Mockup of a results page listing all translations we know of for a specific original article.

In case you have not yet had a chance to read this previous blog post by my colleague, please do so. It accurately addresses the well-known dilemma faced in the current scholarly publishing landscape in science.

About 2000 languages are spoken in Africa, and these traditional and indigenous dialects are also a medium of choice in knowledge dissemination for many scientists on and off the continent.

As pointed out in the earlier mentioned blog post, many African scientists are proficient in the English language and regularly publish their scholarly communications in English. In 2018 alone, the AfricArXiv preprint repository's collection of African scholarly work had 25 submissions in English.

It is however not lost on such scholars, myself included, that whereas we are multilingual, we face unilingual constraints in expressing our mostly written publications as well as sometimes in our spoken word presentations.

I believe that technology, in its role as an enabler of positive change, could play a vital role in bridging this gap: Artificial Intelligence (A.I.) could provide a seamless translation platform for scientific work written in different official African languages.

One of the key tasks for such an A.I. system could be accepting English papers written by African researchers and offering a seamless translation service into as many African languages as possible, and vice versa, in a manner that is structured to build on previous learning.

To quote my colleague in the previous blog post “With the advancement of Natural Language Processing (NLP), it should be fairly easy for non-Indonesian [or African] speakers to understand articles written in Indonesian [or African local dialects]. Hence the burden to immediately use English as the main language of science could be lowered.”

Translate Science is an open volunteer group interested in improving the translation of the scientific literature. The group has come together to support work on tools and services, and to advocate for translating science.

The group's members have different backgrounds and motivations. Hydrogeologist Dasapta Irawan would like scientists to be able to write in the language of the people they serve. Ben Trettel works on the breakup of turbulent water jets and regrets that so much insight from the Russian turbulence literature is ignored. Victor Venema works on observed climate trends and needs information on (historical) measurement methods, which are kept in local languages; his field needs to understand climate impacts everywhere and quality data from all countries of the world. Luke Okelo, Johanssen Obanda and Jo Havemann are working with AfricArXiv, the community-led Open Access portal to promote African research output. They are interested in seeing scientific literature in African languages transcend the traditional scholarly publishing barriers that indigenous languages come up against and will soon launch a collaborative effort to translate African scholarly manuscripts into various African languages.

For the group the term “scientific literature” has a wide spectrum of forms and can mean anything from articles, reports and books, to abstracts, titles, keywords and terms. Summaries in other languages are also helpful.

We are interested in a range of activities to help translations: providing information, networking, designing and building tools and lobbying for seeing translations as valuable research output.

We have this blog, our Wiki, our distribution list and a micro-blogging account for discussions on what we can do to promote translations and to provide information on how to make translations and find already existing ones.

Various tools (and communities using them) could help to find and produce translations. A database with translated articles could make them more discoverable. This database should be filled by people and institutions who made translations, as well as with precursor databases and articles from translation journals (from the Cold War era). With appropriate interfaces (APIs), reference managers, journal and preprint repositories and peer review systems could automatically indicate that translations are available. Such a database could also help build datasets that can be used to train machine learning methods for the translation of digitally small languages.

There are great tools for the collaborative translations of software interfaces. Similar tools for scientific articles would be even more helpful: translating an article well requires knowledge of two languages and the topic; this combination is easier to achieve with a group and together translating is more fun. Automatic translations could provide a first draft and save a lot of work.

If we could determine which articles are most valuable to translate, that may increase the incentives of (national) science foundations to fund their translation. With the multilingual Wikidata knowledge base we could improve searching the literature with multilingual tools, so that relevant articles in other languages are also found. In addition, we could make text mining multilingual, and non-native speakers could be presented with explanations of difficult terms in their mother tongue.

Rather than being appreciated, translations sometimes even lead to punishments. Google accidentally punishes people translating keywords because their software sees that as keyword spamming, while translated articles are often seen as plagiarism. We need to talk about such problems and change such tools and rules so that scientists translating their articles are instead rewarded.

English as a common language has made global communication within science easier. However, this has made communication with non-English communities harder. For English speakers it is easy to overestimate how many people speak English because we mostly deal with foreigners who do speak English. It is thought that about one billion people speak English. That means that seven billion people do not. For example, at many weather services in the Global South only a few people master English, but they use the translated guidance reports of the World Meteorological Organization (WMO) a lot. For the WMO, as a membership organization of the weather services, where every weather service has one vote, translating all its guidance reports into many languages is a priority.

Non-English or multilingual speakers, in Africa and on other continents, could participate in science on an equal footing if there were a reliable system where scientific work written in a non-English language is accepted and translated into English (or any other language) and vice versa. Language barriers should not waste scientific talent.

Translated scientific articles open science to regular people, science enthusiasts, activists, advisors, trainers, consultants, architects, doctors, journalists, planners, administrators, technicians and scientists. Such a lower barrier to participating in science is especially important on topics such as climate change, environment, agriculture and health. The easier knowledge transfer goes both ways: people benefiting from scientific knowledge and people having knowledge scientists should know. Translations thus help both science and society. They aid innovation and tackling the big global challenges in the fields of climate change, agriculture and health.

Translated scientific articles speed up scientific progress by tapping into more knowledge and avoiding double work. They thus improve the quality and efficiency of science. Translations can improve public disclosure, scientific engagement and science literacy. The production of translated scientific articles also creates a training dataset to improve automatic translations, which for most languages is still lacking.

As you have read this far, you are probably interested in translations and science. Do join us. Write to us any time: we have a call every two weeks and a mailing list. Leave a comment below. Add your knowledge and ideas to our Wiki. Write a blog post to start a discussion. Join us on social media or add this blog to your RSS reader. Spread the message that Translate Science exists to anyone who may be interested as well. …

There is a language bias in the current global scientific landscape that leaves non-English speakers at a disadvantage and prevents them from actively participating in the scientific process both as scientists and citizens. Science’s language bias extends beyond words printed in elite English-only journals.

https://www.frontiersin.org/articles/10.3389/fcomm.2020.00031/full

Hello all.

It’s Dasapta from Indonesia. Thank you, Victor, for inviting me to join the Translate Science initiative. Although scientists come from every corner of the earth and live perfectly well using their own native tongue, it is English that has been used as the lingua franca of science.

Conversely, many scientists in Africa, Asia, Latin America and Europe still publish their work in national journals, often in their mother tongue, which creates the risk that worthwhile insights and results might be ignored, simply because they are not readily accessible to the international scientific community. To overcome this dilemma, several initiatives now aim to strengthen the impact and quality of national journals with the goal of gaining greater international visibility for articles published in a language other than English.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1796769/

As someone born and raised in Indonesia, a non-English-speaking country, I find it important to promote the use of the national language (Indonesian) instead of English in scholarly communications, because:

  • Most research in Indonesia is about local problems. Therefore it is only logical that the main mode of dissemination should be in Indonesian.
  • Although many Indonesians take English courses from kindergarten or primary school onwards, English is still not used as the first language. Therefore it takes more time and effort to translate our research into English, while it could be shared faster if we used Indonesian.
  • With the advancement of Natural Language Processing (NLP), it should be fairly easy for non-Indonesian speakers to understand articles written in Indonesian. Hence the burden to immediately use English as the main language of science could be lowered.