Imagine discovering a hidden diary from a bygone era, filled with cryptic, nonsensical letters. Or consider the modern-day mystery of an anonymous author whose novel becomes a bestseller, sparking a worldwide guessing game about their true identity. These two scenarios, though centuries apart, are linked by a single, powerful idea: every system, whether a secret code or a human writer, has an unconscious, repeating rhythm. Finding that rhythm is the key to unlocking its secrets.
This is the story of the Kasiski examination, a 19th-century cryptographic breakthrough that cracked one of history’s most formidable ciphers. More than that, it’s the story of how the very same logic is used today to reveal the ghost in the machine—the author behind the anonymous text.
For centuries, the king of ciphers was the polyalphabetic cipher. Unlike a simple substitution cipher (like a Caesar cipher, where ‘A’ always becomes ‘D’), a polyalphabetic cipher uses multiple substitution alphabets. The most famous of these is the Vigenère cipher, invented in the 16th century but not widely used until the 19th. For roughly three centuries it resisted attack, earning the name le chiffre indéchiffrable (the indecipherable cipher).
Its strength came from a simple keyword. Let’s say our keyword is LEMON. To encrypt the message “ATTACK AT DAWN”, you’d write the keyword repeatedly above it:
Keyword:   LEMONLEMONLE
Plaintext: ATTACKATDAWN
The first ‘A’ in “ATTACK” is encrypted using the ‘L’ alphabet, the first ‘T’ is encrypted using the ‘E’ alphabet, the second ‘T’ using the ‘M’ alphabet, and so on. This means the two ‘T’s in “ATTACK” would become two completely different letters in the ciphertext. This complexity defeated the most common code-breaking tool of the era: frequency analysis. In English, ‘E’ is the most common letter, but in a Vigenère-encrypted text, its encrypted form would be scattered across different letters, leaving no statistical trace.
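The keyword table boils down to modular arithmetic: each plaintext letter is shifted by the alphabet position of the keyword letter sitting over it. Here is a minimal Python sketch, assuming an uppercase A to Z alphabet and that spaces are simply dropped. The function name vigenere_encrypt is my own, not part of any library.

```python
from itertools import cycle

A = ord("A")

def vigenere_encrypt(plaintext: str, keyword: str) -> str:
    """Shift each letter by the position (A=0 ... Z=25) of its keyword letter."""
    # Assumption: keep only the letters A-Z and ignore everything else.
    letters = [c for c in plaintext.upper() if "A" <= c <= "Z"]
    out = []
    for p, k in zip(letters, cycle(keyword.upper())):
        out.append(chr((ord(p) - A + ord(k) - A) % 26 + A))
    return "".join(out)

print(vigenere_encrypt("ATTACK AT DAWN", "LEMON"))  # LXFOPVEFRNHR
```

Notice that the two T’s in “ATTACK” come out as X and F, which is exactly what throws off a simple letter count.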
Enter Friedrich Kasiski, a Prussian infantry officer and cryptographer. In 1863, he published a book that detailed a devastatingly effective attack on the Vigenère cipher. He wasn’t the first to break it (Charles Babbage did so earlier but never published his work), but Kasiski was the one who revealed the method to the world.
His insight was deceptively simple: look for repeated sequences of letters in the ciphertext.
Why would repetitions occur in such a complex cipher? Kasiski realized it happens when a repeated sequence in the original plaintext lines up with the same stretch of the repeating keyword, which requires the distance between the two occurrences to be a multiple of the keyword’s length.
Consider this example:
Plaintext: THEENEMYWILLATTACKATTHEEASTWALL
Keyword: CODE (length 4)
When we align them, something interesting happens with the word “THE”:
Keyword:   CODECODECODECODECODECODECODECOD
Plaintext: THEENEMYWILLATTACKATTHEEASTWALL
The first “THE” is encrypted using the keyword letters “COD”. The second “THE” starts 20 letters later, and because 20 is a multiple of the keyword length, it is encrypted using the exact same “COD” sequence from the keyword. This means that the resulting three-letter sequence in the ciphertext, “VVH”, is identical in both places. A codebreaker scanning the garbled text would see a repeating pattern, a crack in the cipher’s armor.
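You can check the alignment argument yourself. The sketch below assumes the plaintext THEENEMYWILLATTACKATTHEEASTWALL (where the two occurrences of “THE” sit 20 letters apart) and the keyword CODE. The helper key_letters_at is a hypothetical name of mine; it simply reports which keyword letters land on a given stretch of plaintext.

```python
A = ord("A")

def encrypt(plaintext: str, keyword: str) -> str:
    """Vigenere-encrypt an uppercase A-Z string that contains no spaces."""
    return "".join(
        chr((ord(p) - A + ord(keyword[i % len(keyword)]) - A) % 26 + A)
        for i, p in enumerate(plaintext)
    )

def key_letters_at(keyword: str, start: int, length: int) -> str:
    """Keyword letters that sit over plaintext[start:start + length]."""
    return "".join(keyword[(start + i) % len(keyword)] for i in range(length))

plaintext = "THEENEMYWILLATTACKATTHEEASTWALL"
keyword = "CODE"
ciphertext = encrypt(plaintext, keyword)

first = plaintext.find("THE")               # position 0
second = plaintext.find("THE", first + 1)   # position 20

for pos in (first, second):
    print(pos, key_letters_at(keyword, pos, 3), ciphertext[pos:pos + 3])
# 0 COD VVH
# 20 COD VVH
```

If the second “THE” started at an offset that is not a multiple of 4, say 18, it would sit under “DEC” instead of “COD”, and the repetition would vanish from the ciphertext.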
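Kasiski turned this observation into a methodical process: find every repeated sequence in the ciphertext, measure the distance between its occurrences, and look for the factors those distances share. The keyword length is very likely one of those common factors.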
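In code, the examination comes down to collecting repeated sequences, recording the gaps between them, and counting which candidate key lengths divide those gaps. What follows is an illustrative sketch rather than Kasiski’s procedure verbatim: the function names, the minimum sequence length of 3, and the sample ciphertext (THEENEMYWILLATTACKATTHEEASTWALL encrypted with CODE) are my assumptions.

```python
from collections import Counter, defaultdict

def repeat_distances(ciphertext: str, seq_len: int = 3) -> dict:
    """Map each repeated sequence to the gaps between its consecutive occurrences."""
    positions = defaultdict(list)
    for i in range(len(ciphertext) - seq_len + 1):
        positions[ciphertext[i:i + seq_len]].append(i)
    return {
        seq: [b - a for a, b in zip(pos, pos[1:])]
        for seq, pos in positions.items()
        if len(pos) > 1
    }

def factor_counts(distances: list, max_len: int = 20) -> Counter:
    """Count how many distances each candidate key length (2..max_len) divides."""
    counts = Counter()
    for d in distances:
        for n in range(2, max_len + 1):
            if d % n == 0:
                counts[n] += 1
    return counts

ciphertext = "VVHIPSPCYWOPCHWEEYDXVVHICGWACZO"
repeats = repeat_distances(ciphertext)
print(repeats)  # {'VVH': [20], 'VHI': [20]}

all_distances = [d for gaps in repeats.values() for d in gaps]
print(sorted(factor_counts(all_distances).items()))
# [(2, 2), (4, 2), (5, 2), (10, 2), (20, 2)]
```

With only one short message, every divisor of 20 ties, so the true length of 4 is just one candidate among several. A longer ciphertext produces more repeats at different distances, and the factor they all share starts to stand out.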
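Once you know the keyword is, say, 4 letters long, the “unbreakable” cipher collapses. You can split the ciphertext into four separate columns (every fourth letter, starting from the first, second, third, and fourth positions), each of which was encrypted with a single keyword letter and is therefore just a simple Caesar cipher. From there, old-fashioned frequency analysis makes quick work of the rest.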
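Here is a rough sketch of that final step, under several assumptions of mine: the key length is already known, the text is uppercase A to Z, and the values in ENGLISH_FREQ are approximate textbook letter frequencies. Instead of eyeballing the most common letter, it scores each of the 26 possible shifts with a chi-squared statistic, a common modern stand-in for manual frequency analysis. All function names are hypothetical.

```python
A = ord("A")

# Approximate relative frequencies of A..Z in English text (percent).
ENGLISH_FREQ = [8.17, 1.49, 2.78, 4.25, 12.70, 2.23, 2.02, 6.09, 6.97, 0.15,
                0.77, 4.03, 2.41, 6.75, 7.51, 1.93, 0.10, 5.99, 6.33, 9.06,
                2.76, 0.98, 2.36, 0.15, 1.97, 0.07]

def encrypt(plaintext: str, keyword: str) -> str:
    """Vigenere-encrypt the A-Z letters of plaintext; used only to build a demo."""
    letters = [c for c in plaintext.upper() if "A" <= c <= "Z"]
    return "".join(
        chr((ord(p) - A + ord(keyword[i % len(keyword)]) - A) % 26 + A)
        for i, p in enumerate(letters)
    )

def split_columns(ciphertext: str, key_len: int) -> list:
    """Column i holds every key_len-th letter starting at i: one Caesar cipher each."""
    return [ciphertext[i::key_len] for i in range(key_len)]

def best_shift(column: str) -> int:
    """Return the Caesar shift whose decryption looks most like English."""
    def chi_squared(shift: int) -> float:
        counts = [0] * 26
        for c in column:
            counts[(ord(c) - A - shift) % 26] += 1
        n = len(column)
        return sum((counts[i] - n * f / 100) ** 2 / (n * f / 100)
                   for i, f in enumerate(ENGLISH_FREQ))
    return min(range(26), key=chi_squared)

def recover_key(ciphertext: str, key_len: int) -> str:
    """Solve each column independently and turn the shifts back into letters."""
    return "".join(chr(best_shift(col) + A)
                   for col in split_columns(ciphertext, key_len))

SAMPLE = (
    "Imagine discovering a hidden diary from a bygone era, filled with "
    "cryptic letters. Finding the rhythm is the key to unlocking its secrets. "
    "Once you know the keyword length the unbreakable cipher collapses. "
    "You can split the ciphertext into separate columns, each of which is "
    "just a simple Caesar cipher. From there, old-fashioned frequency "
    "analysis makes quick work of the rest. The underlying principle is "
    "exactly the same: unconscious, repeated patterns can betray a hidden "
    "system. No single trait can identify an author, but when a computer "
    "analyzes dozens or hundreds of these features across a large body of "
    "text, it can build a remarkably accurate statistical model of their style."
)

ciphertext = encrypt(SAMPLE, "CODE")
print(recover_key(ciphertext, 4))
```

On a few hundred letters of ordinary English, this approach usually recovers the keyword. On a message as short as the 31-letter example, each column has only seven or eight letters, which is too few for the statistics to be reliable.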
So, what does a 19th-century military cipher have to do with linguistics and identifying anonymous authors? Everything. The underlying principle is exactly the same: unconscious, repeated patterns can betray a hidden system.
In the Vigenère cipher, the hidden system is the keyword. In writing, the hidden system is an author’s unique, ingrained stylistic habits—their “authorial fingerprint.”
This is the field of stylometry, the statistical analysis of literary style. Just as you have a unique fingerprint or signature, you have a unique “stylome” composed of the countless unconscious choices you make when you write.
An author’s fingerprint isn’t about using big, fancy words. In fact, it’s often the opposite. The most reliable markers are the small, functional words and patterns we use without a second thought, such as how often we reach for “the”, “of”, or “but”, and how long our words tend to be.
No single trait can identify an author. But when a computer analyzes dozens or hundreds of these features across a large body of text, it can build a remarkably accurate statistical model of their style.
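To make the idea concrete, here is a deliberately tiny sketch of a function-word profile. It is not the method used by any particular researcher: the word list, the function names, and the choice of Manhattan distance are all my assumptions, and real studies use far more features and far more text.

```python
import re
from collections import Counter

# A hypothetical, very short list of function words; real studies use 100 or more.
FUNCTION_WORDS = ["the", "of", "and", "to", "a", "in", "that", "it", "but", "with"]

def profile(text: str) -> list:
    """Relative frequency of each function word among all words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = len(words) or 1  # avoid dividing by zero on empty input
    return [counts[w] / total for w in FUNCTION_WORDS]

def distance(p: list, q: list) -> float:
    """Manhattan distance between two profiles: smaller means more similar."""
    return sum(abs(a - b) for a, b in zip(p, q))

def closest_author(unknown: str, candidates: dict) -> str:
    """Name of the candidate whose profile is nearest to the unknown text."""
    target = profile(unknown)
    return min(candidates,
               key=lambda name: distance(target, profile(candidates[name])))

# Toy texts, invented purely for illustration.
candidates = {
    "Author A": "The cat sat in the hall and the dog slept in the sun.",
    "Author B": "It was a day that began with rain, but it ended with a song.",
}
unknown = "The bird sang in the tree and the fox hid in the den."

print(closest_author(unknown, candidates))  # Author A
```

The unknown sentence shares no nouns or verbs with Author A’s, yet the two have identical rates of “the”, “in”, and “and”. That is the point: the profile ignores subject matter and keys on the habits underneath it.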
The most famous modern application of these principles—a digital Kasiski examination—was the unmasking of Robert Galbraith, the author of the 2013 crime novel The Cuckoo’s Calling.
When speculation arose, researchers Patrick Juola and Peter Millican ran stylometric analyses. They converted the book into a set of statistical data, focusing on features like word length frequency and, most importantly, the frequency of the 100 most common words. They compared Galbraith’s “fingerprint” to those of other suspected authors and, of course, to J.K. Rowling.
The result pointed strongly to Rowling: of the candidates tested, her earlier work was the closest match to The Cuckoo’s Calling, and she confirmed her authorship shortly afterward. The repeated, unconscious “keyword” of her writing style gave her away. Similar stylistic analysis helped identify Joe Klein as the author of Primary Colors and played a role in pinpointing Ted Kaczynski as the Unabomber through the language of his manifesto.
Friedrich Kasiski’s goal was to break military codes by spotting patterns in chaos. He could never have imagined that 150 years later, computational linguists would be using his core logic to solve literary puzzles.
The Kasiski method reveals a profound truth about communication: we all operate on a hidden keyword. For the cryptographer, it was a word like LEMON or CODE. For each of us, it’s the sum of our linguistic habits, a personal rhythm that embeds itself in everything we write. In an age of digital text and powerful algorithms, true anonymity has become the real chiffre indéchiffrable, the truly unbreakable code.