What AI & Human Translators Simplify: Insights from an Entropy-Based Study

What do we know about the first recorded example of a written translation? Well, it dates back around 4,500 years, to when a Sumerian scribe carved a bilingual glossary into clay to help readers move between Sumerian & Akkadian. Ever since, people have questioned how “faithful” a translation can be to its source, though the very idea of faithfulness is itself debatable. With AI now producing translations at scale (& displacing many human translators along the way), we suddenly have new ways to examine how far a translation drifts from the original.

A recent study by Guangyuan Yao & Lingxi Fan (PLOS One, 2025) brings this ancient question up to date. Statistical tools like entropy (a measure of linguistic diversity) let us estimate how much complexity is lost or preserved in different kinds of translation. The study does this by comparing Human Translation, Google Translate & ChatGPT across political, fiction & academic texts.

The study

The researchers built a multi-genre Chinese–English corpus containing:

original English texts
professional Human Translations
Google Translate outputs
ChatGPT-4 translations (prompted simply with “Translate into English”)

They analysed two forms of entropy:

Unigram entropy (lexical complexity): how diverse & evenly distributed the vocabulary is
POS (part-of-speech) entropy (syntactic complexity): how varied the grammatical structures are

Because entropy is less sensitive to text length than measures like TTR or mean sentence length [1], it’s well suited to comparing genres & translation modes.
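
For readers who want to see what these two measures look like in practice, here is a minimal sketch, not the study's own pipeline. It assumes plain Shannon entropy, H = -Σ p(x) log2 p(x), computed over the relative frequencies of items in a text. The function name shannon_entropy, the whitespace tokenisation and the hand-written POS tags are all illustrative assumptions; in a real analysis the tags would come from a POS tagger.

```python
import math
from collections import Counter


def shannon_entropy(items):
    """Shannon entropy (in bits) of the empirical distribution over items.

    Hypothetical helper for illustration: pass word tokens to get unigram
    entropy, or part-of-speech tags to get POS entropy.
    """
    counts = Counter(items)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Unigram entropy: items are word tokens.
# Assumption: naive whitespace tokenisation, good enough for a toy example.
varied = "the quick brown fox jumps over a lazy dog".split()
repetitive = "the dog saw the dog and the dog saw the dog".split()

# POS entropy: items are part-of-speech tags.
# Assumption: tags written by hand here rather than produced by a tagger.
pos_tags = ["DET", "ADJ", "NOUN", "VERB", "ADP", "DET", "ADJ", "NOUN"]

print("unigram entropy (varied):    ", shannon_entropy(varied))
print("unigram entropy (repetitive):", shannon_entropy(repetitive))
print("POS entropy:                 ", shannon_entropy(pos_tags))
```

The varied sentence uses nine different words once each, so its entropy is as high as it can be for nine tokens; the repetitive sentence leans on "the" and "dog", so its entropy is lower. That is the sense in which a simplified translation would show lower entropy than its source.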

The findings

A clear pattern emerged:

Original texts were the most lexically & syntactically complex.
Human Translations simplified the originals but preserved more structure than Machine Translation.
ChatGPT produced richer vocabulary than Google Translate but simpler syntax than Human Translation.
Google Translate showed the strongest simplification overall.

Key stats:

Lexical complexity (vocabulary variety): Originals & ChatGPT were almost identical at the top, with Human Translation slightly simpler & Google Translate the simplest of all.
Syntactic complexity (sentence structure variety): Originals were the most complex, followed by Human Translation, then ChatGPT, with Google Translate again the simplest.

Genre shaped the results too:

Political texts were lexically richest in their original form.
Fiction resisted syntactic simplification more than other genres.
Academic texts were consistently the simplest syntactically across all modes.

This echoes earlier work showing that creative genres often push translators to preserve nuance & stylistic texture. For example, translating a line like “She spoke in sentences that curled like smoke” demands choices that maintain rhythm & imagery, and those choices resist flattening.

Translation will always be part of how we move between languages, but it’s built on choices: what to foreground, what to simplify, what to let go. In ELT, our learners make similar choices every day, & the translation tools they rely on influence the linguistic patterns they internalise.

Teacher Takeaways?

Expect simpler syntax from Machine Translation tools, especially Google Translate.
ChatGPT may offer more varied vocabulary, but still smooths out grammatical texture.
Genre matters: fiction tends to preserve complexity better than academic or political texts, so learners may see richer structures when translating or analysing narrative language compared to more informational genres.

Any bright ideas for how to help learners notice the differences between original texts & machine-generated ones?

[1] TTR (Type–Token Ratio) is a simple measure of lexical variety: how many different words (types) appear compared to the total number of words (tokens). High TTR = lots of variety; low TTR = more repetition.

Mean sentence length is just the average number of words per sentence, often used as a rough proxy for syntactic complexity.

Both are useful but can be misleading because they’re heavily affected by text length — which is why entropy offers a more stable alternative.
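
To make the text-length point concrete, here is a toy sketch comparing TTR with unigram entropy on a short text and a longer one built from exactly the same vocabulary. The function names and the artificial "text" (an eight-word list repeated) are assumptions for illustration only; real texts are messier, and entropy is less sensitive to length rather than immune to it.

```python
import math
from collections import Counter


def type_token_ratio(tokens):
    """Number of distinct words (types) divided by total words (tokens)."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def unigram_entropy(tokens):
    """Shannon entropy (in bits) of the word-frequency distribution."""
    total = len(tokens)
    counts = Counter(tokens)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Assumption: an artificial eight-word vocabulary, repeated to simulate a
# longer text with exactly the same mix of words.
vocab = ["the", "cat", "sat", "on", "a", "mat", "by", "door"]
short_text = vocab * 1    # 8 tokens
long_text = vocab * 10    # 80 tokens, same vocabulary, same proportions

for name, tokens in [("short", short_text), ("long", long_text)]:
    print(name, "TTR:", type_token_ratio(tokens),
          "entropy:", unigram_entropy(tokens))
```

In this toy case the TTR drops sharply for the longer text even though the mix of words has not changed, while the entropy stays the same for both.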

tl;dr-ELT

Translation, 4,500 years old and still trying to get it just right
