Re: Steganography and Steganalysis
I have received some useful feedback to yesterday's message on steganography and steganalysis. Here are some clarifications to my cryptic presentation and a correction. I was most interested in finding if the steganographic capacity of English is high enough to make steganography practical for everyday use, so I didn't even address the meaningfulness of the output. For example, if I could only produce a capacity of a tenth of one percent, the meaningfulness would not even be an issue because nobody would want to send large messages via steganography anyway. A capacity of 10%, requiring the public text to be only 10 times as long as the hidden text, may be good enough for everyday use. If that can be achieved, then the next step is to see if meaningful output can have a high steganographic content. If so, then I expect that several cypherpunks would want to pursue that. (FYI: I plan to do more analysis on my own, even if nobody else does.) My guesstimate for the steganographic capacity of English did not provide a steganographic algorithm. For example, I haven't even looked into how to map a bit string to a parenthesis grouping; I was just noting that if you have X(N-1) possibilities, there must be log (X(N-1)) bits available, assuming all possibilities are equally likely. Is there a simple-to-compute mapping of the numbers 1 through X(N-1) to the X(N-1) parenthesizations of an N word sentence? Fortunately, N rarely gets large for ordinary English sentences, so a general solution may be unnecessary. My presentation mistakenly implied that a good steganographic algorithm may have the form: E(K, M) = E2( E1(K,M) ) where E1 is a cryptographically secure encryption function with public key K and hidden message M, E2 somehow converts the encrypted message to ordinary English text, and E1, K, and E2 are publicly known. Unfortunately, if the inverse of E2 (let's call it D2), is easily found, then the presence of a hidden message can be detected easily, even though that message cannot be decrypted easily. This is because the output of E1, which is incompressible, is easily distinguishable from D2(ordinary English text). Here is a better formulation for the steganographic schema: E(K1, K2, M) = E3( E2(K2, E1(K1,M) )) where: E1(K1,M) converts the hidden message M to a cryptographically secure cyphertext by using the key K1. E1 and K1 are public, but the decryption function D1 is difficult to compute without the private key PK1. E3(C) converts a bit string to ordinary-looking English text. Assume that both E3 and its inverse D3 are public. E2(K2, C) converts the cyphertext C into another bit string such that E2(K2, C) has the same statistical characteristics as D3(ordinary English text). Assume that E2 and K2 are public, but D2 is difficult to compute without the private key PK2. Function E1 is normal public key cryptography, which produces an incompressible cyphertext. I hope that function E3 has a high enough steganographic capacity to make steganalysis worthwhile. Function E2 cannot be a normal encryption function because its output needs to be as compressible as D3(ordinary English text). Both functions E2 and E3 are new types of functions that require more research to work well. I still haven't seen any references to this type of steganography being done before, but thanks to the various people who gave pointers to tools that may help in building it. Kevin Q. Brown INTERNET kqb@whscad1.att.com or kevin_q_brown@att.com
My favorite scheme was to encode messages in trailing spaces and/or tabs in netnews messages. You could also put internal tabs in place of spaces. In fact, you could do this with news messages that flow "through" your site, (if the messages aren't protected with a crypto checksum), so that you would not be the message's sender (and it wouldn't be addressed to anyone either -- recipients get very good privacy). This would be one way for a Unix "worm" program to report back to its master...and/or receive instructions. John Gilmore PS: You could put short interesting stuff just in your message-ID's! Not to mention the low order bits of timestamps (exactly *what* second did it arrive, now?).
In <9305252227.AA27968@toad.com>, John Gilmore wrote...
My favorite scheme was to encode messages in trailing spaces and/or tabs in netnews messages. You could also put internal tabs in place of spaces.
In fact, you could do this with news messages that flow "through" your site, (if the messages aren't protected with a crypto checksum), so that you would not be the message's sender (and it wouldn't be addressed to anyone either -- recipients get very good privacy).
This would be one way for a Unix "worm" program to report back to its master...and/or receive instructions.
John Gilmore
PS: You could put short interesting stuff just in your message-ID's! Not to mention the low order bits of timestamps (exactly *what* second did it arrive, now?).
/* Jonathan Stigelman, Stig@netcom.com, PGP public key by finger */ /* fingerprint = 32 DF B9 19 AE 28 D1 7A A3 9D 0B 1A 33 13 4D 7F */
participants (3)
-
gnu
-
kqb@whscad1.att.com
-
stig@netcom.com