Numbers Stations: North Korea Allegedly Activates No. 719 Expedition Agents via YouTube

grarpamp grarpamp at gmail.com
Mon Sep 21 01:27:59 PDT 2020


https://www.mattblaze.org/blog/neinnines/

18 September 2020
A Cryptologic Mystery
Did a broken random number generator in Cuba help expose a Russian
espionage network?

I picked up the new book Compromised last week and was intrigued to
discover that it may have shed some light on a small (and rather
esoteric) cryptologic and espionage mystery that I've been puzzling
over for about 15 years. Compromised is primarily a memoir of former
FBI counterintelligence agent Peter Strzok's investigation into
Russian operations in the lead-up to the 2016 presidential election,
but this post is not a review of the book or concerned with that
aspect of it.

Early in the book, as an almost throwaway bit of background color,
Strzok discusses his work in Boston investigating the famous Russian
"illegals" espionage network from 2000 until their arrest (and
subsequent exchange with Russia) in 2010. "Illegals" are foreign
agents operating abroad under false identities and without official or
diplomatic cover. In this case, ten Russian illegals were living and
working in the US under false Canadian and American identities. (The
case inspired the recent TV series The Americans.)

Strzok was the case agent responsible for two of the suspects, Andrey
Bezrukov and Elena Vavilova (posing as a Canadian couple under the
aliases Donald Heathfield and Tracey Lee Ann Foley). The author
recounts watching from the street on Thursday evenings as Vavilova
received encrypted shortwave "numbers" transmissions in their
Cambridge, MA apartment.

Given that Bezrukov and Vavilova were indeed, as the FBI suspected,
Russian spies, it's not surprising that they were sent messages from
headquarters using this method; numbers stations are part of
time-honored espionage tradecraft for communicating with covert
agents. But their capture may have illustrated how subtle errors can
cause these systems to fail badly in practice, even when the
cryptography itself is sound.

First, a bit of background. For at least the last sixty years,
encrypted shortwave radio transmissions have been a standard method
for sending messages to covert spies abroad. Shortwave radio has
several attractive properties here. It covers long distances; it's
possible for a single transmitter to get hemispheric or even global
coverage. Shortwave radio receivers, while less common than they once
were, are readily available commercially in almost every country and
are not usually suspicious or alerting to possess. And while it's
relatively easy to tell where a shortwave signal is coming from, its
wide coverage area makes it very difficult to infer exactly who or
where the intended recipients might be. Both the US (and its allies)
and the Soviet Union (and its satellites) made extensive use of
shortwave radio for communicating with spies during the cold war, and
enigmatic "numbers" transmissions aimed at spies continue to this day.

The encryption method of choice used by numbers stations is called a
"one time pad" (OTP) cipher. OTPs have unique advantages over other
encryption methods. Used properly, they are unconditionally secure; no
amount of computing power or ingenuity can "break" them without
knowledge of the secret key. Also, they are almost deceptively low
tech. It is possible to encrypt and decrypt OTP messages by hand with
nothing more than paper and pencil and simple arithmetic. The
disadvantage is that OTPs are cumbersome; you need a secret key as
long as all the messages you will ever send, with no part of the key
ever re-used for multiple messages. Typically, the key would be
printed as a series of digits bound into a pad of paper, with each
page removed after use; hence the name "one time pad". OTPs can be
difficult in practice to use properly and are quite vulnerable if used
improperly; more on that later.

The OTP messages sent to spies by shortwave radio typically consist of
decimal digits broadcast in either a mechanically recorded voice or in
morse code (more recently, digital transmissions are also used) on
designated frequencies at designated times, usually in four or five
digit groups (hence the term "numbers station"). After copying and
verifying a header in the message, the agent would remove the
corresponding page from their secret OTP codebook and add each key
digit to each corresponding message digit using modulo-10 arithmetic
(without carry). The resulting "plaintext" digits are then converted
to text with a simple substitution encoding (e.g., A=01, B=02, etc.,
although other encodings are generally used). That's all there is to
it. The security of the system depends entirely on the uniqueness and
secrecy of the OTP codebook pad given to each agent.
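
The paper-and-pencil step described above can be sketched in a few
lines of Python. This is only an illustration under stated assumptions:
the function names, the key digits, and the received groups are all
hypothetical, and the A=01, B=02 encoding is the simplified example
from the text rather than an encoding any real station is known to use.

```python
def otp_combine(message_digits: str, key_digits: str) -> str:
    """Add each key digit to the matching message digit, modulo 10, no carry."""
    if len(key_digits) < len(message_digits):
        raise ValueError("key must be at least as long as the message")
    return "".join(
        str((int(m) + int(k)) % 10) for m, k in zip(message_digits, key_digits)
    )


def decode_digits(plain_digits: str) -> str:
    """Convert digit pairs to letters using the illustrative A=01 ... Z=26 table."""
    pairs = [plain_digits[i:i + 2] for i in range(0, len(plain_digits), 2)]
    return "".join(chr(ord("A") + int(pair) - 1) for pair in pairs)


# Hypothetical example: two five-digit groups as copied off the air,
# and the matching (made-up) digits from the agent's pad page.
received = "77646 39662".replace(" ", "")
pad_page = "3141592653"

plain = otp_combine(received, pad_page)
print(plain, decode_digits(plain))
```

Because each digit is combined independently and without carry, the
whole operation can be done column by column on paper, which is what
makes the scheme practical for an agent with no special equipment.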

To prevent "traffic analysis" that might reveal to an observer the
number of active agents or the volume of messages sent to them,
numbers stations typically operate on rigidly fixed schedules, sending
messages at pre-determined times whether there is actually a message
to be sent or not. When there is no traffic for a given timeslot,
random dummy "fill" traffic is sent instead. The fill traffic should
be indistinguishable to an outsider from real messages, thereby
leaking nothing about how often or when the true messages are being
sent. But more on this later.

None of this is by itself news. The existence of numbers stations has
been publicly known (and tracked by hobbyists) since at least the
1960's, and OTPs are an elementary cryptographic technique known to
every cryptographer. However, Strzok mentions two interesting details
I'd not seen published previously and that may solve a mystery about
one of the most well known numbers stations heard in North America.

First, Compromised reveals that the FBI found that during at least
some of the time the illegals were under investigation, the Russian
numbers intended for them were sent not by a transmitter in Russia
(which might have difficulty being reliably received in the US), but
relayed by the Cuban shortwave numbers station. This is perhaps a bit
surprising, since the period in question (2000-2010) was well after
the Soviet Union, the historic protector of Cuba's government, had
ceased to exist.

The Cuban numbers station is somewhat legendary. It is a powerful
station, operated by Cuba's intelligence directorate but co-located
with Radio Habana's transmitters near Bauta, Cuba, and is easily
received with even very modest equipment throughout the US. While its
numbers transmissions have taken a variety of forms over the years,
during the early 2000's it operated around the clock, transmitting in
both voice and morse code. The station was (and remains) so powerful
and widely heard that radio hobbyists quickly derived its hourly
schedule. During this period, each scheduled hourly transmission
consisted of a preamble followed by three messages, each made up
entirely of a series of five digit groups (with a brief period of
silence separating the three messages). The three hourly messages
would take a total of about 45 minutes, in either voice or morse code
depending on the scheduled time and frequency. Every hour, the same
thing, predictably right on schedule (with fill traffic presumably
substituted for the slots during which there was no actual message).

If you want to hear what this sounded like, here's a recording I made
on October 4, 2008 of one of the hourly voice transmissions, as
received (static and all) in my Philadelphia apartment:
www.mattblaze.org/private/17435khz-200810041700.mp3. The transmission
follows the standard Cuban numbers format of the time, starting with
an "Atención" preamble listing three five-digit identifiers for the
three messages that follow, and ending with "Final, Final". In this
recording, the first of the three messages (64202) starts at 3:00, the
second (65852) at 16:00, and the third (86321) at 29:00, with the
"Final" signoff at the end. The transmissions are, to my cryptographic
ear at least, both profoundly dull and yet also eerily riveting.

And this is where the mystery I've been wondering about comes in. In
2007, I noticed an odd anomaly: some messages completely lacked the
digit 9 ("nueve"). Most messages had, as they always did and as you'd
expect with OTP ciphertext, a uniform distribution of the digits 0-9.
But other messages, at random times, suddenly had no 9s at all. I
wasn't the only (or the first) person to notice this; apparently the
9s started disappearing from messages some time around 2005.

This is, to say the least, very odd. The way OTPs work should produce
a uniform distribution of all ten digits in the ciphertext. The odds
of an entire message lacking 9s (or any other digit) are
infinitesimal. And yet such messages were plainly being transmitted,
and fairly often at that. In fact, in the recording of the 2008
transmission linked to above, you will notice that while the second
and third messages use all ten digits, the first is completely devoid
of 9s.
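
To put a rough number on "infinitesimal": if every digit is drawn
uniformly and independently, the chance that one particular digit never
appears in n digits is (9/10)^n. The sketch below evaluates that for a
few lengths. The lengths are assumptions for illustration only (the
text does not give message lengths); 750 digits corresponds to a
hypothetical 150 five-digit groups.

```python
def prob_digit_absent(n_digits: int, alphabet_size: int = 10) -> float:
    """Probability that one specific digit never occurs in n uniform random digits."""
    return ((alphabet_size - 1) / alphabet_size) ** n_digits


# Assumed message lengths, in digits (not taken from the source).
for n in (100, 250, 750):
    print(f"{n:4d} digits: P(no 9s) = {prob_digit_absent(n):.3e}")
```

For the assumed 750-digit message the result is on the order of 10^-35,
so even a single 9-less message of that size is not something chance
would plausibly produce, let alone a steady stream of them.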

I remember concluding that the most likely, if still rather
improbable, explanation was that the 9-less messages were dummy fill
traffic and that the random number generator used to create the
messages had a bug or developed a defect that prevented 9s from being
included. This would be, to say the least, a very serious error, since
it would allow a listener to easily distinguish fill traffic from real
traffic, completely negating the benefit of having fill traffic in the
first place. It would open the door to exactly the kind of traffic
analysis that the system was carefully engineered to thwart. The
9-less messages went on for almost ten years. (If I were reporting
this as an Internet vulnerability, I would dub it the "Nein Nines"
attack; please forgive the linguistic muddle). But I was resigned to
the likelihood that I would never know for sure.
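
How little effort such a distinguisher would take can be shown with a
short sketch. It assumes the hypothesis above (fill traffic never
contains a 9) and uses simulated digits, not real intercepts; the
function names and the minimum-length threshold are hypothetical.

```python
import random
from collections import Counter


def missing_digits(message: str) -> set[str]:
    """Return the decimal digits that never occur in the message."""
    counts = Counter(ch for ch in message if ch.isdigit())
    return {d for d in "0123456789" if counts[d] == 0}


def looks_like_fill(message: str, min_length: int = 200) -> bool:
    """Flag a message as suspected fill if it is long enough and has no 9s."""
    n_digits = sum(ch.isdigit() for ch in message)
    return n_digits >= min_length and "9" in missing_digits(message)


# Simulated traffic: a generator that can never emit 9 versus a sound one.
rng = random.Random(0)
fill = "".join(rng.choices("012345678", k=750))
real = "".join(rng.choices("0123456789", k=750))

print("fill flagged:", looks_like_fill(fill))
print("real flagged:", looks_like_fill(real))
```

The length threshold is there only so that a very short message, which
could lack a 9 by chance, is not flagged; a listener needs nothing more
than a digit tally per message to apply the test.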

And this brings us to the second observation from Strzok's book.

Compromised doesn't say anything about missing nueves, but Strzok does
mention that the FBI exploited a serious tradecraft error on the part
of the sender: the FBI was able to tell when messages were and weren't
being sent during the weekly timeslot when the suspect couple was
observed in the room where they copied traffic. Even worse (for the
illegals), empty message slots correlated perfectly with times that
the suspect couple was traveling and not able to copy messages. This
observation helped confirm the FBI's suspicions and ultimately led to
their arrest and expulsion (along with the rest of the Russian
illegals network).

I suspect that Strzok simplified the story for the book and that it
was the 9-less Cuban messages that they surmised to be "no message"
traffic. The FBI (or NSA) no doubt noticed the lack of 9s just as I
(and others) did, and likely came to the same conclusions I did. The
difference is that they were in a position to confirm the hypothesis
through real-time surveillance of actual espionage suspects.

Ironically, this was not the first time that Russian/Soviet
intelligence has been burned by sloppy OTP practices. The first was,
more famously, the disastrous re-use of OTPs discovered and exploited
in the Venona intercepts.

One time pads can be a cryptographic landmine. They have a very
attractive property - provable security! - but at the cost of
unforgiving operational assumptions that can be hard to meet in
practice. OTPs have long been a favorite of hucksters selling
supposedly "unbreakable" encryption software. So remember this story
next time someone tries to sell you their super-secure
one-time-pad-based crypto scheme. If actual Russian spies can't use it
securely, chances are neither can you.

Anyway, as they say on the radio...

FINAL
FINAL

