
Translation and Artificial Intelligence

by Henry Whittlesey Schroeder

Introduction

1. Context

2. The stakeholders

2.1 The client or customer

2.2 The language service provider (LSP)

2.3 The translator

3. Artificial intelligence and accuracy

3.1 Strengths

3.1.1 Reliability and knowing your translator

3.1.2 Higher quality, passive knowledge, two “people”

3.2 Weaknesses

3.2.1 Style

3.2.2 Harmonization

3.2.3 Terminology

3.2.4 Interpretation of abstractions and references

4. Model for the future – translation buyers, language service providers and translators

Introduction

The first draft of a translation will and should be produced by a neural, machine-trained algorithm to achieve the greatest possible accuracy.

As a translator who has worked primarily on German->English translations for the last twenty years, I find the next phase in the evolution of translation already plainly clear.

In the following article, I will examine various factors to be considered in AI-produced translations, especially the perspectives of the customer, language service provider, and translator. This will be followed by a general discussion of the strengths and weaknesses of AI translation. The article will close with an outline of the future model for language services.

The primary aim of this article is to increase transparency and understanding among the various stakeholders in the translation process (customer, language service provider, translator). It will summarize the most important points about AI translation: its strengths and weaknesses as well as the main concerns for buyers, brokers and producers of translations. Each point will be studied at greater length in subsequent articles. This summary is intended to give you the big picture.

1. Context

Today, at essentially the click of a button, we can easily read and understand texts in a foreign language we do not know. This development is far more significant than the early versions of machine translation that began appearing in the aughts (e.g., Google Translate). These new neural translations, unlike their predecessors, produce impeccable sentences with flawless grammar and accurately convey the meaning of even complicated texts well north of 95% of the time. They are better than most human translations I have seen over my twenty years of intermittently revising translations for language service providers. Any translator with less than ten years of experience – which describes most translators working at language service providers (LSPs) with in-house translators – is inferior to a machine.

The incredible achievement of neural machine translation also comes at essentially no cost. By contrast, the translations from most LSPs incur an expense of more than 0.20 EUR/word and generally around 0.30 EUR/word (for German->English translations in Germany). Consequently, it is no surprise that companies requiring texts in a foreign language have chosen AI over human translation. The hype surrounding ChatGPT came on the heels of five years of neural translation offered by Google, DeepL and others, and it coincided with high inflation, falling demand in many sectors due to the corona aftermath, the war in Ukraine, supply chain challenges and rising cost pressures.

While translations provided by LSPs seem to be unaffordable to potential translation buyers, these LSPs are facing an almost impossible situation. This is especially true for larger ones.

On the one hand, demand for translation was already falling before the corona crisis because of the quantum leap achieved by DeepL in 2017; on the other, the cost structure of an LSP was proving to be inflexible.

The gradual decline in demand for translation became an avalanche during the onset of the corona crisis. There was some recovery in 2021 and 2022, but the emergence of the generative artificial intelligence service ChatGPT in the autumn of 2022 and the unprecedented media attention it garnered drove a decline in demand for translation that mirrored the hard lockdown during March and April of the corona crisis. Even if LSPs outsourced a fair amount of the actual translation work (there are different models), they still had the fixed costs of what was a low-margin business before these innovations, and, at least in Germany, the statutory employee protection laws made cutting costs through layoffs expensive. Companies with in-house translators or at least native speakers of target languages could use the neural translation service offered by DeepL and postedit or revise machine-produced translations, but they encountered another conundrum: machine-produced translations do not solve the most problematic issue in the field of translation, namely catching the interpretive and minor mistakes. And it is precisely these errors that inexperienced translators or project managers cum translators overlook in the postedit or revision.

The neural and AI translations produced phenomenal results, but they still had to go through several rounds of revision. In fact, the machine-generated text only achieved savings in the first of four or five rounds of text processing. If a translator in the pre-AI days manually translated a text, then revised it once (comparing translation to original), then revised it a second time, and finally proofread it (only copyediting the target language text), they saved time on solely the first of these four steps with neural machine translation. Depending on how much effort went into the first round, what is called postediting, especially stylistic adjustments (which are very necessary), the entire process “only” became 20 to 30% more efficient. Round 1 was faster. But rounds 2 through 4 did not change. In particular, revising the translation twice is time-consuming. Many translators, including myself, did not initially make much of an adjustment and also did not object when language service providers stipulated that machines not be used. Artificial intelligence obviously produced some translations that a human translator might not think of, but the human alternative was not much worse if they had sufficient experience (more than ten years) as I did.
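The arithmetic behind the “only” 20 to 30% can be sketched in a few lines. The hours per round below are hypothetical figures chosen for illustration, not measurements from my own work, and the function name is likewise just an assumption of this sketch.

```python
def overall_saving(round_hours, round1_speedup):
    """Fraction of total time saved when only round 1 gets faster.

    round_hours: hours for [translation, revision 1, revision 2, proofreading]
    round1_speedup: share of round-1 time saved by machine pretranslation (0..1)
    """
    total = sum(round_hours)
    saved = round_hours[0] * round1_speedup  # rounds 2 through 4 are unchanged
    return saved / total


# Hypothetical split of an 8-hour job (assumed values, not data).
hours = [4.0, 2.0, 1.5, 0.5]
print(f"{overall_saving(hours, 0.5):.0%}")  # 4.0 * 0.5 / 8.0 -> 25%
```

Even if the machine halves the time spent on the first round in this assumed split, the job as a whole becomes only a quarter faster, because the two revisions and the proofread still take as long as before.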

The problem for LSPs using in-house staff is that reassigning a translation manager to translation or using inexperienced staff for translation in order to save costs led to a decline in quality. Translation buyers were paying the manual translation price and receiving translations run through machines and postedited, but still containing errors because the inexperienced staff did not have the skills to catch the errors. It was the same problem that plagued translation agencies after the widespread introduction of DIN EN 9001, which stipulated that a reviser check a translation prepared by another translator (four-eye or double-checking principle). Revisers then as now failed to pick up the misinterpretations and minor errors. The other option for LSPs to save on costs was to cut prices for translations, which meant that the external translators began to skip steps or try to obfuscate the use of machine-based translation.

In either case, the translation buyers began to notice the drop in quality or the persistence of the same problems, but there was a new alternative that cost essentially nothing: neural translation and now generative artificial intelligence services.

Let’s look at each of these stakeholders a little more closely.

2. The stakeholders

There are three primary participants in most language service transactions. They consist of the client or customer, the language service provider (LSP) or translation agency, and the translator. The LSP acts as a liaison or intermediary between the client and the translator. Sales representatives at the LSP generally interact with the client, while the so-called project managers assign the work to be translated to and communicate with the translators who are either at the LSP (in-house) or external (freelance) translators.

Similar to the actors in any business transaction involving a delivered product, all of these parties want the same result: a high-quality deliverable that can be used without further post-performance improvement or correction to preserve the efficiency of their operations.

2.1 The client or customer

In the German-speaking zone (Germany, Switzerland, Austria), the top priority for the vast majority of clients is that they receive a translation free of objective errors. They may ask for British or American or International Standard English; they might order “a translation that does not read like an original”; they can request a translator with credentials in the field – I’ve seen it all and worked on many a project where my unqualified status would rule me out in the client’s eyes, but the leading language service providers and I both know that the most important thing is the number of objective errors in the translation. Every text in Germany will be counter-read, as they say, meaning checked sentence for sentence, by an administrator in the client’s organization and, apart from a possible style guide with instructions on specific topics like orthography, almost all complaints arise from objective errors like a misread or misinterpreted word.

Clients want perfection. Perfection is defined as a translation without objective errors. An objective error is a grammatical mistake, a misinterpretation of the original, a misreading or a typo.

The short careers of translators, the prices charged by many independent translators (in part visible at proz.com), my own experience of attempting to outsource work here and there, the regular concealment of translators’ track records (at proz.com) and translation memories I have seen from working on projects involving other translators – each of these observations strongly suggests that the quality of translations, specifically the accuracy of the translation relative to the original, and the absence of objective errors is a massive problem in the industry. The introduction of DIN EN 9001 did almost nothing to change this and must have caused enormous headaches for LSPs and immense frustration among clients unable to understand how two people could make so many mistakes.

DIN EN 9001 is essentially the four-eye principle or double-checking, but the problem was and is that the revisers would compare the original and translation in one round of review at best and never identify all the objective errors. I ran tests at one point (about 20 cases) and the failure rate was 100% (planted errors): No translator found all the mistakes. In rare cases, I have seen revisions of my own translations where text was simply reworded, but spelling mistakes were made or words left out, making the revised text (objectively) worse than the original translation.

This problem probably explains both my entire career and the very short life of most independent translators and language service providers. Over my twenty years, few smaller companies have survived as long. The author of this article ranks in the top 1% of translators at the leading platform for translation services (proz.com) for verifiable length of service.

With the advent of generative artificial intelligence, however, this has become more complicated.

Clients can now get very accurate translations straight from a machine at no cost. Furthermore, in Germany, every administrator ordering translations is either personally capable of searching for errors or has staff capable of doing so in the department needing the translated texts. This has the potential to make LSPs redundant. In addition, a client’s employees are more familiar with the subject matter of their texts than external translators who spend only a fraction of their time working in the client’s specific (micro) field. Since the revision process in DIN EN 9001 is analogous to postediting machine translations, clients have no reason to believe that LSPs will be able to deliver flawless translations if they failed to find the mistakes prior to AI.

2.2 The language service provider (LSP)

Language service providers want the most affordable translations with the least hassle (from the translator and the client).

When work is outsourced to translators like the author of this article, the LSP forwards the translation either directly to the client, with no more than a cursory check of formatting (sometimes even without that), or to a reviser. If a reviser is involved, their work is usually done with “track changes” so that the quality of the original translation and the actual work of the reviser can be tracked. For reasons of efficiency, these changes are generally adopted wholesale, meaning in their entirety, almost always without the original translator being consulted (especially if the original translator is very experienced and/or the revisions are minor or cosmetic).

In either case, the translation sent to the client should be the end of the process, with the exception of the bill.

At least in the case of German->English translations, there is a lot of potential room for discussion because basically every professional administrator in German companies, especially ones dealing with translation, is well versed in English. I do not have extensive knowledge of how this is handled. Over the years, I have gleaned some insight, although practices vary widely. Minor questions are presumably answered by in-house staff. Larger agencies will have some internal language specialists to prepare the translations and only outsource excess or more difficult work. Many smaller agencies will send all their orders to freelance translators or, less frequently, other agencies. I believe that both try to handle most small requests either internally or by charging the client for this service as well (and thereby ending the discussion immediately). External translators like myself are rarely bothered with minor issues, but it certainly happens.

To limit the potential for a disruption in their processes, LSPs naturally try to find the best translator for a given translation. This is often handled by testing translators before working with them (a terrible practice that actually leaves agencies with an inferior pool of translators in the end, because good translators generally have enough work and just postpone test translations indefinitely or refuse) and by matching the subject matter of the text to a given translator’s field of expertise (a very intelligent practice), which is often checked later by having a specialist in the field review the translator’s target language knowledge. In the case of English, some agencies match the translator’s native variety of English to the client’s wishes.

2.3 The translator

Similar to the language service provider, a professional translator wants as little follow-up work as possible after delivering a translation. A translator should know their daily capacity and, at least in my case, usually has work lined up or coming in on a regular basis (e.g., short jobs sandwiched between others). Deadlines are generally tight, and a translator does not have the time to return to a previously delivered translation, refamiliarizing themselves with the context, decisions and logic, let alone the issues raised by the client.

The system for a single person to come as close as possible to achieving an accurate translation with a reasonable expenditure of time was this:

Step 1: Translation
Step 2: Revision (check of translation against original line by line)
Step 3: Repetition of step 2
Step 4: Proofreading of target text (copyedit of solely the target text)

Accuracy may be the top priority, but even this approach did not guarantee perfection because of the translator’s tendency to “read over” their own mistakes or repeatedly misinterpret a sentence. The emphasis on accuracy is all the more clear in my case because ~90% of my translations are prepared into British English, although I am a US American. Naturally, I have acquired a good working knowledge of British English over the years and am familiar with many idiosyncrasies (such as “in future” rather than “in the future”), not just the orthographic adjustments. Nonetheless, almost any native British speaker will be able to quickly say that I am probably an American, just as I can easily say that when I see a long enough text translated by a native British speaker into American English.

For the most part, the issue of British or American English is almost totally irrelevant in German-speaking countries because commercial or legal texts are not intended for a British or American audience, but rather for non-German speakers in Germany who need to understand something there. English is the lingua franca of the modern world.

So since clients want accuracy, LSPs disregard most other incompatibilities with translators.

3. Artificial intelligence and accuracy

The beginning of 2023 was an amusing time in some respects. As demand fell across the board, I decided to shift around the arrangements I had with some of my longest standing outsourcers (well over 10 years) and accept revision work. In the mid-teens, I completely discontinued revision work for all language service providers, as far too much of a burden was placed on the reviser. Texts were always riddled with all sorts of errors – interpretive, grammatical, poor wording, unfamiliarity with field (this was and remains absurd in the field of finance and economics), misreadings and so on. Moreover, a reviser had no idea who was preparing the translation and what to expect. If it were always the same translator, you would at least become familiar with the type of mistakes they make or the quality of their work. However, as an outsider, you never knew who the translator was, and they tended to change quickly, as I pointed out above. In addition to the quality issue, you only read a text once, so, with the exception of texts seen over and over again (like contracts), you had to work your way into the theme, style and idiosyncrasies of the author and translator on the fly and get it all right in one round. There was also a pricing problem in that you had to compare the translation to the original, which was the set price, but to ensure the quality of the result you had to do a pure target language proofread after the revision (not included in the price). Finally, if the client complained, you as the reviser would not get paid. This was all too much. So translation revision as a service was no longer offered here.

At the beginning of 2023 I began to think that this must have changed after 5+ years of AI translation. In fact, as we will get to in the last section of this article, I started contemplating an entirely different pricing model based on rounds. What I needed to know was whether it was realistic to revise now and whether an additional ten years of experience had made me better at it. Since I had the time, I decided to jump in.

Despite LSPs’ repeated claims that the translations I was revising were human translations (rather than machine ones), I didn’t even need to run them through a check to tell that they were not. Technical features like the position of tags and the consistent appearance of translations of highly problematic terms as preferred by DeepL, the leading AI translation service provider, hinted at this. There was nearly no chance that multiple translators would decide consistently on the same translation. Another hint was that machine translations tend to insert apostrophes and quotation marks without the curl common in English. A careless, superficial revision fails to rectify such flaws.

But the biggest give-away was simply the incredible decline in mistakes and blatant misinterpretations. On the sentence level, the translation is – relative to human translations – nearly always correct. The AI rendering of a long sentence almost never omits a clause; a dative or genitive construction in a sentence is never overlooked. A date or time reference is never missing. The number of errors fell by 90%. On the single sentence level, the translation does not even have to be attentively read by the translator/posteditor and it will still be correct or sufficiently correct (to be defensible) most of the time.

Only in cases where it was almost categorically impossible for a machine to translate the phrase or formulation correctly (e.g., where a term is so obscure that you must look it up and translate it as a paraphrase or in expository text) or where the translation was prepopulated with a so-called fuzzy match from a translation memory (TM) will the translated segments still be riddled with errors. Cases I have recently seen include translations partially prepopulated from a TM in which a few numbers were entered incorrectly (e.g., 2022/23 ended up as 2021/22 in the target segment for an unknown reason), and a sentence in an annual report where the only word that changed from the previous year was “increase,” which became “decrease” in German, but was not corrected in English. The translator failed (despite, in the first case, an excellent QA check in the CAT tool memoQ) to catch these mistakes in the prepopulated text, which was processed much as translations were before artificial intelligence.

On the whole, however, revision had become a lot easier. You are still an outsider in the processing of each text. You still have only one reading of the original to get everything right. But at least you can rely on fundamental accuracy in non-TM-based segments. Despite what the LSPs have been telling me, I can quickly see that each translation has been prepared by a machine, generally DeepL. Moreover, it is as if you know your translator, everything about their approach and quality and where the mistakes are likely to be. Extra attention must be paid to fuzzy matches from TMs, to inter-segment harmonization, terminology, etc., but not to whether every phrase has been included or a negation in the original has become an affirmative statement in the translation.

3.1 Strengths

3.1.1 Reliability and knowing your translator

The reliability of machine translation is one of its many strengths. You can be 99% sure that relevant text has not been omitted or phrases and words illogically misread.

One of the many common mistakes human translators make, especially in the context of long German sentences, is that a small, possibly less important fragment of the original, say a short dative or temporal clause, is overlooked. This almost never happens with algorithmic translation.

Furthermore, the likelihood of mistakes in long, complex sentences is much higher with human translations, as all the material cannot possibly be held in your head, so you have to break up the sentence into pieces, look up obscure words, circle back and figure out where you are in the sentence and so on. Again, neural translations are effectively flawless in this regard as well.

Algorithms also always read original words in at least one of their possible meanings. This eliminates the misreading of a word. For example, the words Markt (market) and Marke (brand) are very similar in German. Out of context or in a strange context, a human translator may misread Markt as Marke and translate it entirely incorrectly as brand. This never happens with machine-based translation unless a single term can be translated divergently based on context, and the context is misinterpreted (see section 3.2).

Aside from the (virtual) elimination of these critical flaws, neural machine translation offers a host of additional advantages. Simple sentences without abstractions or references to previous or subsequent text fundamentally require no reworking. This saves brain capacity for more complex passages. It also allows you to focus more on the place of the segment or sentence in the overall context.

In moderately complex or quite complex sentences, the assumption that all the sentence fragments, i.e., clauses, have been incorporated into the translation means that the reworking of the sentence can be done with the confidence that all the substance has been conveyed. It just needs to be repositioned (see section 3.2).

As I touched on above, you also know what kind of translation you are getting – i.e., high quality. Unlike the volatile results of translation prior to AI, a posteditor knows what is likely to be correct and what they have to pay attention to. This reduces the variables and, consequently, the workload. You concentrate on the specific weaknesses in AI (see section 3.2).

3.1.2 Higher quality, passive knowledge, two “people”

Yet of all the strengths, obviously, the greatest of all is an accurate translation a human translator would almost certainly never have thought of under the time constraints of modern efficiency requirements (just to make a living). Even the best of translators with the most noble intentions do not have the time to ponder the translation of a sentence for hours. Nor do they have endless resources for research. You are paid by the word and not paid much at that. You know the translation has to be correct, but whether it is the most creative translation or simply a good one depends on what time allows (in particular also because it is more important to allocate mental energy to preventing mistakes than to maximizing creativity). You are not paid to exhaustively research wording or contemplate analogous meanings phrased entirely differently in the target language. Algorithms drawing on a vast body of text can do that in seconds, coming up with very creative options a translator can even choose from.

The icing on the cake is when the machine produces or phrases a translation in a way that is beyond the scope of your active knowledge. It is like active and passive vocabulary. The one is smaller than the other. Even without the time constraints imposed by modern commerce, human translators (without the help of machines) are expected to conjure up excellent and accurate translations for 2,500 words a day at least five days a week for years on end. That is difficult even for a genius. But even with time (which is non-existent), exceptional knowledge of your fields and staggering linguistic skills, every translator has more passive than active knowledge.

When you look at a machine translation of larger texts with more complex sentences, you repeatedly encounter words, phrases and formulations that are simply outside of your active knowledge. You can tell they are right or accurate, but would not think of them yourself, no matter how much time you have. This is a truly fun and incredible experience. For people like me who translated manually for 15+ years, largely without external input and only with exposure to other translations in reference texts and translation memories (TMs), this experience with neural translations has been eye-opening. It’s like a near-sighted person seeing a world they know quite well with new glasses.

Suddenly – at a lower cost, no less – two “people” with immense expertise are preparing a translation. The algorithm starts by pretranslating each sentence to the best of its ability. The translator or posteditor follows the machine by checking each sentence and improving whatever is wrong, bad, in need of harmonization, etc. If something is not right, the post-machine processor (human) either corrects it based on knowledge or tricks of the trade or they consult generative AI for other interpretations that seem more likely to align with the original. The human and machine interact to produce nearly flawless results – at a fraction of the cost.

3.2 Weaknesses

AI translations still make a large number of mistakes. Some of these are technical in nature and very easy to pick up. Others are more difficult and require know-how.

Most translations are still processed with a CAT tool. The postediting is done in this tool, which basically breaks up the text into sentences, called segments, generally with the original language on the left side and the translation on the right. Periods and colons are used by the tool to determine the end of a sentence. Automated segmentation on this basis causes problems when an abbreviation with periods appears in the original or when a colon introduces a long list of possibilities for continuing a sentence (e.g., the event organizer is responsible for: 1. xxx 2. yyy 3. zzz). It is common for the machine translation to get confused in both of these cases. These errors are easily caught in the postediting process.
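The segmentation problem can be made concrete with a deliberately naive sketch. It is not how any particular CAT tool works; real tools generally apply more elaborate, configurable rules with exception lists. The function name and the sample sentence are assumptions made up for this illustration.

```python
import re


def naive_segments(text):
    """Split after every period or colon that is followed by whitespace."""
    parts = re.split(r"(?<=[.:])\s+", text)
    return [part for part in parts if part]


# Hypothetical source text: an abbreviation ("Abs.") and a colon that
# introduces a numbered list continuing the sentence.
sample = ("Siehe Abs. 3 des Vertrags. "
          "Der Veranstalter haftet für: 1. Schäden 2. Verluste")

for segment in naive_segments(sample):
    print(repr(segment))
```

A reader sees two sentences here, but the naive rule cuts the text after the abbreviation, after the colon and after each list number. Each fragment would then be sent to the machine without the rest of its sentence, which is why the translation tends to get confused at exactly these points.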

Another problematic issue besides the four separate ones discussed below is specific to the German language. Germans often use the present tense in texts with a future meaning. There are not two present tenses in German (as in English and Russian, for example) and there is a future tense for the future, but, for whatever reason (presumably the repetition of werden (will/shall)), German authors often employ the present for the future. This does not work in English for the most part. And algorithms have difficulty recognizing this. It should also be noted that Germans also use the present tense for events in the past, especially in history, to make it more immediate or emphatically imply the parallels between the recounted history and its interpretation on the one hand and current events on the other. AI translations (and also human translators) often shift this to the past. This point is controversial and can be debated. Both are definitely possible. But the present tense for future events is not common in English. It must generally be corrected.

The four most prominent weaknesses in AI translations are style, harmonization, terminology, and the correct interpretation of abstractions, with the fourth being the most problematic.

3.2.1 Style

The biggest weakness in machine-based translation is the style, which, at least as produced by DeepL, is somewhat stilted from an American perspective. It should be noted that even when DeepL is set to American English, the formulations and style are influenced by British English on balance. British English also has more of a clausal structure than thoroughly streamlined American English, so I may not be the best judge of the results produced by it, but I have found that moderately complex to complex sentences require a lot of rewriting. This is the most time-consuming aspect of working with machine translation. It is the reason why, as I wrote above, the price difference between a machine-based and human translation is “only” 20-30% (I am aware that this is a huge amount in economics and business administration). If you take an average sentence of 10-15 words (in German) and need to make 3 adjustments in the pre-translated English version, that is roughly equivalent to the amount of time needed to translate the sentence by yourself without machine support based on my estimates.

To achieve a fluid English text with at least a reduction in the clausal structure of German, which is common in translation to English, you must reconfigure the sentences produced by algorithms. Generative AI combines sentences, splits sentences and can even remove superfluous or redundant information, but at least the current iteration often shies away from the subjective decisions on sentence restructuring. My guess is that this is partially due to the fact that these sentences in and of themselves are not objectionable as clausal sentences, but when they appear in a series of multiple consecutive sentences, it becomes too cumbersome to read. Until future iterations of AI translations integrate this feel for language, posteditors will have to rewrite.

3.2.2 Harmonization

As in any language, German has single words for concepts that are translated divergently in other languages. Kunde, for example, equates to client or customer. Geschäftsjahr is conveyed as financial year (BE) or fiscal year (AE). The algorithms still have trouble harmonizing larger texts by using one word consistently. The same applies to capitalization, for example, in contracts. If the term client or customer is to be capitalized throughout a contract (e.g., “hereinafter referred to as (the) Client”), AI translations may capitalize the term for a while, but will eventually lose the thread.

In all likelihood you can (or will soon be able to) instruct generative AI to capitalize every instance of a word. This will probably apply to most harmonization needs. For the time being, it must be done manually by a person (with search & replace). It is also necessary (and will remain so in the future) to review each instance of harmonization because, for example, the term customer/client might be used in two senses in a contract – for a party to the contract and for other customers (of the contractual customer/client or the other contracting party). This will always apply, even with generative AI that follows instructions.
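The manual step described here (search, review each instance, then replace) can be supported by a small script. The following is only a sketch under my own assumptions: the function name find_term_variants, the sample sentences and the list of variants are hypothetical and do not come from any translation tool. It lists every occurrence of a term with some surrounding context so that a person can decide, case by case, whether it refers to the contracting party or to other customers.

```python
import re

def find_term_variants(text, variants, context=30):
    """List every whole-word occurrence of any variant (case-insensitive).

    Returns tuples of (text as found, character offset, surrounding snippet)
    so that a human reviser can check each instance before replacing it.
    """
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(v) for v in variants) + r")\b",
        re.IGNORECASE,
    )
    hits = []
    for match in pattern.finditer(text):
        start = max(0, match.start() - context)
        end = min(len(text), match.end() + context)
        hits.append((match.group(0), match.start(), text[start:end]))
    return hits

# Hypothetical contract excerpt with inconsistent wording and capitalization.
sample = (
    "The Client shall pay the fee. The customer may terminate the contract. "
    "The client's own customers are not parties to this contract."
)

hits = find_term_variants(sample, ["client", "customer", "customers"])
for found, offset, snippet in hits:
    print(f"{offset:4d}  {found!r:12}  ...{snippet}...")
```

The script deliberately replaces nothing. Whether “customers” in the last sentence should be left alone or harmonized is exactly the kind of decision that remains with the posteditor.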

The postediting of machine translations also requires a certain know-how. A posteditor or translator/reviser must be aware of what to look for, especially if only one round of postediting is to be performed. Harmonization is an example of this, as is the occasional omission of a sentence fragment in long sentences (which the algorithm may have interpreted as superfluous), along with the points discussed below.

3.2.3 Terminology

Specialists or experts in a field easily find flaws in algorithmic terminology. This is the flip side of what was discussed under strengths. As mentioned above, the machines are better than humans with less than 10 years of experience, but they are not flawless in all areas of translation. Although I deal primarily with German-to-English translations, the issue is not one of British vs. American English, even if this does exacerbate the problem. Rather, it is more a matter of knowing standards, norms and other documentation establishing certain terms in a specific field for continental European use of English. Perhaps in the UK or United States, the AI-produced terminology is common, but the fundamental equivalent is not used in the European context. Bilanzstichtag in German is an example. Reporting date would be far more common in American English, but the German term is almost always translated as balance sheet date, a formulation that sounds quite stilted in the United States. Another is Finanzamt – the equivalent of the Internal Revenue Service (IRS) in the United States. You would never translate Finanzamt in a German context as IRS, although a machine might. The algorithms know many of these things and are constantly improving. Nonetheless, this still requires attention.

Another problem is the translation of terms that require a judgement about the author’s priority. Gemeinde is a good example of this. It can be reproduced interchangeably as community or municipality in almost any context. However, if the author is evidently using Gemeinde as a type of politically defined location, as in the series city, town and Gemeinde, then the machine may not pick this up in an isolated sentence, but a translator processing the entire text and knowing the big picture can judge the appropriate translation to be municipality.

3.2.4 Interpretation of abstractions and references

Misinterpretation of abstractions and references is one of the more frequent errors made by machines. While I can only surmise what the reasons for this are, it is likely that the machines are guessing on the basis of surrounding context in some cases and, even more interestingly, adopting a position with regard to the original.

German authors like to create diversity in their texts by abstracting from the concrete case. A simple form of this is seen in the use of generic or umbrella terms preceding names or titles. For example, the descriptive term Bereich (area/segment/division, etc.) appears repeatedly for categories that would often simply be named in English with a description of their type. Here is an example:

Unseren Fokus richten wir auf das Zusammenwirken von Finanzierungs- und personellen Versorgungsengpässen bzw. -lücken in den Bereichen Rente, Pensionen, Steuereinnahmen, Pflege, Gesundheit, Bildung, öffentliche Sicherheit und Wohnen.

We have focused on the interplay between financing and staff shortages or gaps in benefits connected with the areas of retirement, pensions, tax income, nursing care, health, education, public safety and housing.

Whereas an American writer would simply say “connected with retirement, pensions, tax income…” without a template, Germans designate the type of category before the name or specification. The text cited above then follows this passage by referring to these Bereiche (areas) as Systeme (systems) for whatever reason. In this case the translation is easily understandable. While it is less than ideal, in my opinion, the reference is clear. The analogous terms in English can be used, with the reader understanding that retirement, pensions, tax income… are both areas and systems. Yet such shifting of generic terms or the invention of more creative abstractions can cause the machine to confuse the reference and interpret the passage in relation to something else in the text. Humans do this as well, to be fair. It is a common issue with abstractions.

Working with machines also becomes interesting and funny when they seem to be correcting the original text. I recently encountered a text with these sentences:

Ich danke ihnen ganz herzlich dafür! Sie tragen so dazu bei, das Leben von Menschen mit sogenannter geistiger und mehrfacher Behinderung ins Bewusstsein zu rücken, Öffentlichkeit herzustellen und nachhaltig zu verbessern.

Machine translation: I thank them so much! You are helping to raise awareness, generate publicity and make a lasting difference in the lives of people with what are called intellectual and multiple disabilities.

The you at the start of the second sentence is incorrect – it should be they because it refers to them in the preceding sentence (these are organizations helping people with disabilities). It is entirely possible that this is an innocent mistake made by the machine. Because the German pronoun Sie opens the second sentence, it is capitalized, which means it can be they or you in the absence of context. However, the context is readily apparent here from the preceding sentence. Neural or generative AI translation generally realizes this very easily. It uses surrounding context to make decisions on more complex issues than the alignment of these pronouns in simple sentences. My guess is that the AI felt the speaker should be addressing these organizations directly in the you form. It won’t propose something that is overtly false, like translating ihnen as you, but the moment it encounters ambiguity with Sie, it takes the opportunity to propose a better way of expressing the thanks here. Whether the machine is right or wrong, the posteditor must step in here and write they in case the line-by-line review by the customer raises questions about quality on the basis of pronoun reference mistakes such as this.

4. Model for the future – translation buyers, language service providers and translators

The dynamics of the future model will be fleshed out at length in a separate article, but I would like to conclude by pointing out a few central issues for the translation stakeholders here. The two most important ones are (i) responsibility for ensuring the correspondence between the original and the translation and (ii) rounds of revision.

If translation buyers assume full or partial responsibility for the alignment of the translation with the original, for example, by performing a round of revision themselves, then translation agencies can assign a postedit or revision job to a translator (in-house or external) who carries out one round of revision (comparison) and a proofreading (copyedit of target text). For the translation buyer, this should be possible in the German->English language pair at a rate of 0.09-0.12 EUR/word, under the assumption that ~33% goes to the translator, ~40-45% to the cost of revenue for the language service provider and ~22-27% to profits (language service providers are rarely, if ever, profitable in the long run, but the putative goal must be that). The quality should be reasonable, with the translation buyer needing to make minor adjustments at most or to inquire in isolated instances whether something is incorrect.
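To make this split concrete, the arithmetic can be sketched in a few lines. The shares used below are single values picked from within the ranges given above (33% translator, 42% cost of revenue, 25% profit), and the rate of 0.10 EUR/word and the job size of 1,000 words are illustrative assumptions, not figures from an actual invoice. The function name split_job_price is likewise hypothetical.

```python
def split_job_price(words, rate_per_word, translator_share, lsp_cost_share):
    """Split the price of a postediting job between translator, LSP costs and profit.

    The profit share is whatever remains after the other two shares.
    All amounts are in EUR.
    """
    total = words * rate_per_word
    profit_share = 1.0 - translator_share - lsp_cost_share
    return {
        "total": total,
        "translator": total * translator_share,
        "lsp_cost_of_revenue": total * lsp_cost_share,
        "lsp_profit": total * profit_share,
    }

# Illustrative assumptions: 1,000 words at 0.10 EUR/word, shares taken
# from within the approximate ranges mentioned in the text.
breakdown = split_job_price(
    words=1000,
    rate_per_word=0.10,
    translator_share=0.33,
    lsp_cost_share=0.42,
)

for item, amount in breakdown.items():
    print(f"{item:22} EUR {amount:7.2f}")
```

Changing the rate to the lower or upper end of the range (0.09 or 0.12 EUR/word) shows how little of the price reaches the person performing the revision under these assumptions.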

One round of revision is generally not sufficient, however, for a text that needs to be perfectly aligned with an original. So if the translation buyer does not or cannot review the work, the entire process becomes more fraught. As indicated above, a translator reviewing a machine translation may be very familiar with the field, but they still need to read an original at least twice in case they misread or misinterpreted something in the first read of the original.

We have all been on vacation to an unfamiliar place and missed many aspects of what we saw. The same applies to reading a novel or book. If you revisit a former place or re-peruse a book, you notice many more features, details, relations… The same applies to translation. When an external translator reads a text for the first time and compares it to the machine-produced translation, they may be familiar with the style of algorithmic translation, but they must still familiarize themselves with the original. Every writer has their own idiosyncrasies, with some being more complex to translate than others. A professional translator grasps the big picture easily. The flaws will lie in the details. As explained above, if the translation buyer also reviews the text with their virtually unlimited knowledge of their own organization and sector, they will catch flaws in the details. The external translator cannot do this. They need two rounds of revision (and then the pure proofreading/copyediting).

The problem with this approach is the expense. Abstractly, it does not seem like much of an issue – just pay a little more, pay for one more round of revision. When that one more round is marked up across the board (for operating costs and then profit) by the language service provider (and not necessarily passed on to the translator), the translation buyer again encounters rates of 0.20 EUR/word or more, which becomes very expensive. Two pages of fully written text cost EUR 200.00 even though the translation produced for free by an algorithm is almost flawless.

Finding the isolated problems is time-consuming, but the price is unreasonable. There is no simple solution to this conundrum. The best and most affordable option is for the translation buyer to act as described above, i.e., handle part of the revision on their own, or to work directly with a dual-language native speaker of English.

Finally, I might add that this issue is not isolated to translation. Generative AI has given rise to analogous problems in coding, copywriting, marketing and certainly many other fields. I can have generative AI produce code or write a very reasonable white paper. Buyers of these types of products are also confronting the complicated pricing of post-production improvement, the identification of mistakes made by algorithms. In coding, for example, it is often critical that a programmer think of aspects that were disregarded in the production of the code. This is not altogether different from the remarks made on harmonization here. A certain know-how is required to produce the best code with the fewest holes. It is quickly clear whether code works or not: you see something on the screen or you don’t, but useful features or functions may be missing. The same is now true for translation. You can have the text in understandable English from a machine, but need to find the parts where the English does not correspond to the original as it should.

The road to working with generative artificial intelligence will be a fascinating and complicated one. We should enjoy the journey before computers can do everything.

Future articles will examine these topics in more detail.

Published: June 2023
