How do I know if it's encrypted?
I keep seeing the idea that to keep out of trouble remailers and Data Havens should require that data be encrypted before it is accepted. My question is how do I know it is encrypted? If I say that anyone sending me data to be massaged by my system must first encrypt it, how do I know they are in fact complying with that request? After all, this is the area for the paranoid to hang out in.

I see some possible options. Most don't seem workable. I could ...

(1) Look at the incoming data, which of course would be impractical and defeat the whole idea.

(2) Force them to pgp it, but that could be defeated by having enough of a pgp sig. that my system is fooled. Not to mention they must use *MY* idea of what good encryption is. After all, I could say you must use my encryption software that has a backdoor I know of, i.e. Clipper, or that costs money and can only be bought from me. This last point would be one way of making sure you made some money, but does seem impractical.

(3) Perform a histogram analysis on it; if it doesn't pass a certain threshold, reject the whole thing. Although cute, I don't like this one.

(4) Encrypt it with my own key and decrypt it before squirting it back out. This doesn't seem to gain me anything, though, since it could be said that I still have the ability to look at the data.

(5) Only accept data from a remailer or other service that would be guaranteed to be encrypted. This seems like it would lead to a 'good ole boy' network that could exclude service providers it doesn't like. etc. etc.

Who's to say that the data isn't encrypted? I could be hiding a real message in that eagle spread of a porn picture. 'Simple, don't allow porn but only accepted images.' Well, that certainly sounds like a can of worms waiting to squirm around your toes. Is the question of what is encrypted data similar to what material is porn or some other 'evil' data? Am I missing something in this simple requirement of only dealing with encrypted data?
Or am I simply blowing something way out of proportion? -Mark ---------- Mark Oeltjenbruns marko@Millcomm.com N0CCQ SnipIt Research Finger for PGP key. 'My other key is 2048 bits.'
My question is how do I know it is encrypted?

Calculate an entropy measure of some sort. Entropy is a measure of distributional skew. Maximum entropy means minimum skew. For human-readable text of any sort, the monogram entropy, i.e. the entropy of individual characters, will _always_ be detectably less than maximal. Encrypted text will always be near maximal. The two are easy to distinguish. ASCII-armored encrypted text will always be right at 6 bits per byte.

For speed of implementation, you don't even need to look at much text. You can get a statistically significant measure quite quickly from the first couple of kilobytes. And since you're only really worried about detecting non-randomness, you don't even need to calculate the exact entropy but rather an approximation of it. This approximation can be done with entirely fixed point arithmetic, if you're a bit clever about it.

A practical system would cut out a notch at 6/8 for ASCII armor, which would make approximation techniques a bit tricky. More practical is just to detect ASCII armor with a regular expression recognizer and de-armor it before the entropy check.

Eric
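The check described above can be sketched roughly as follows. This is only an illustration, not Eric's code: the armor pattern, the 2048-byte sample size, the 7.5 bits-per-byte threshold, and the function names are all assumptions picked for the example, and the de-armoring step ignores the armor checksum.

```python
import base64
import math
import os
import re
from collections import Counter

# Assumed pattern for a PGP-style armor block (illustrative, not a full parser).
ARMOR_RE = re.compile(
    rb"-----BEGIN PGP MESSAGE-----\r?\n(.*?)-----END PGP MESSAGE-----",
    re.DOTALL,
)


def monogram_entropy(data: bytes) -> float:
    """Shannon entropy of single-byte frequencies, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in Counter(data).values())


def dearmor(data: bytes) -> bytes:
    """Return the decoded body of the first armor block, or data unchanged."""
    m = ARMOR_RE.search(data)
    if m is None:
        return data
    lines = [ln.strip() for ln in m.group(1).splitlines()]
    if b"" in lines:
        # Armor headers (e.g. "Version:") end at the first blank line.
        lines = lines[lines.index(b"") + 1:]
    # A line starting with "=" is the armor checksum, not message data.
    body = b"".join(ln for ln in lines if ln and not ln.startswith(b"="))
    return base64.b64decode(body)


def looks_encrypted(data: bytes, sample: int = 2048, threshold: float = 7.5) -> bool:
    """De-armor if needed, then test the entropy of the first `sample` bytes.

    Very short inputs cannot reach the threshold (n bytes give at most
    log2(n) bits), so they are always rejected by this sketch.
    """
    head = dearmor(data)[:sample]
    return monogram_entropy(head) >= threshold


if __name__ == "__main__":
    text = b"Plain English text has a skewed letter distribution. " * 50
    print("text:  ", round(monogram_entropy(text), 2), looks_encrypted(text))
    noise = os.urandom(4096)
    print("random:", round(monogram_entropy(noise), 2), looks_encrypted(noise))
```

Random bytes stand in for ciphertext in the demo. De-armoring before measuring avoids the need for a second acceptance notch near 6 bits per byte.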
Mark Oeltjenbruns wrote:
I keep seeing the idea that to keep out of trouble remailers and Data Havens should require that data be encrypted before it is accepted. My question is how do I know it is encrypted? If I say that anyone sending me data to be massaged by my system must first encrypt it, how do I know they are in fact complying with that request? After all, this is the area for the paranoid to hang out in.
(1) Look at the incoming data, which of course would be impractical and defeat the whole idea.
Actually, no. If the remailed material is encrypted, then looking at it is harmless. (And if it is not....) The "ideal mix" neither looks at nor keeps records about remailed items, of course. The "nonideal mix" may easily insist on encryption.

I won't go through the rest of the points here, but there's a key word here: entropy. Get familiar with it now (and not just 50 years from now, when the worms and the bacteria will be giving lectures).

Abstractly, it is not possible to ever prove that a file is either encrypted or unencrypted. Practically, encrypted files have high entropy per character (characters appear with approximately equal frequency), while unencrypted files have relatively low entropy, reflecting the patterns and n-tuple clusterings in ordinary languages.

Sophisticated entropy measures are available, and have been discussed here. But there's an easier approach: try to compress the file. An encrypted ( = high entropy) file will generally not compress, and may even expand in size. An ordinary message in English or Dutch or whatever, such as this one, will compress significantly, to perhaps half its uncompressed size. (Quibblers, this is the place where you announce the precise compression seen...)

--Tim May

--
..........................................................................
Timothy C. May         | Crypto Anarchy: encryption, digital money,
tcmay@netcom.com       | anonymous networks, digital pseudonyms, zero
                       | knowledge, reputations, information markets,
W.A.S.T.E.: Aptos, CA  | black markets, collapse of governments.
Higher Power: 2^859433 | Public Key: PGP and MailSafe available.
Cypherpunks list: majordomo@toad.com with body message of only: subscribe cypherpunks. FAQ available at ftp.netcom.com in pub/tc/tcmay
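A minimal sketch of the compression test, for anyone who wants to try it. It uses zlib from the Python standard library. The 0.95 cutoff and the function names are assumptions chosen for illustration, not figures from this thread. ASCII-armored input would need to be de-armored first, since base64 carries only 6 bits per byte and compresses noticeably on its own.

```python
import os
import zlib


def compression_ratio(data: bytes, level: int = 9) -> float:
    """Compressed size divided by original size (values near or above 1.0
    mean the data did not compress)."""
    if not data:
        return 1.0
    return len(zlib.compress(data, level)) / len(data)


def probably_encrypted(data: bytes, cutoff: float = 0.95) -> bool:
    """Treat data as high-entropy if compression saves less than about 5%.

    The cutoff is an illustrative guess; already-compressed files such as
    archives or some image formats will also pass this test.
    """
    return compression_ratio(data) >= cutoff


if __name__ == "__main__":
    message = (b"An ordinary message in English or Dutch or whatever "
               b"will compress significantly. ") * 40
    print("text ratio:  ", round(compression_ratio(message), 3))
    print("random ratio:", round(compression_ratio(os.urandom(4096)), 3))
```

Like the entropy measure, this only separates high-entropy data from low-entropy data; it cannot prove that anything is actually encrypted.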
In article <m0rSFsb-000kfuC@mill2.millcomm.com>, Mark Oeltjenbruns <marko@millcomm.com> wrote:
I keep seeing the idea that to keep out of trouble remailers and Data Havens should require that data be encrypted before it is accepted. My question is how do I know it is encrypted? If I say that anyone sending me data to be massaged by my system must first encrypt it, how do I know they are in fact complying with that request? After all, this is the area for the paranoid to hang out in.
I see some possible options. Most don't seem workable. I could ...
[...]
(2) Force them to pgp it, but that could be defeated by having enough of a pgp sig. that my system is fooled. Not to mention they must use *MY* idea of what good encryption is. After all, I could say you must use my encryption software that has a backdoor I know of, i.e. Clipper, or that costs money and can only be bought from me. This last point would be one way of making sure you made some money, but does seem impractical.
You're being contradictory here. First you say that you have no way of reliably knowing if it's PGP encrypted, then you say it forces them to use a certain type of encryption. Well, if they can fool your system, they aren't forced to use it! But that's not really my point. I think this method is very valid and could be used in the Real World (well, on the net, at least). Let's say you require everyone to use PGP. Well, if I don't trust PGP but I trust SpamCrypt, there is nothing stopping me from encrypting my data with SpamCrypt and THEN PGP and sending it off to your haven. You see the data is PGP-encrypted, you don't have the PGP key to decrypt it so you can't be accused of having the ability to look at my data, and - correct me if I'm wrong here - unless there's some specific mathematical relation between PGP and SpamCrypt, my encryption is as good as the STRONGEST layer. In fact, as you suggest, this is a good way to implement a pay-per-use haven. All you have to do is only let people use your proprietary data format. You can either sell just the program (a one-time fee, like selling accounts on your haven), or sell single-use keys. Single-use keys are not only a type of one-time pad (again, I may be wrong there), but can be presold in arbitrary lots so companies can buy many keys from you and resell them.
(5) Only accept data from a remailer or other service that would be guaranteed to be encrypted. This seems like it would lead to a 'good ole boy' network that could exclude service providers it doesn't like. etc. etc.
I don't see the problem with this. You will end up with two types of havens - those that will accept data from a remailer anyone can use (or from a remailer that will accept data from one somewhere down the line) and those that won't. The former are free for anyone to use, and the latter aren't. Sound like the difference between moderated and unmoderated newsgroups? And I don't hear anyone on cypherpunks bitching about the evil of moderated newsgroups. What would be wrong with setting up a haven just for you and your friends? Or anyone but Detweiler? Or anyone but me? Waiting to be called from some mistake, -Spam -- -- Spam is: Steve Marting <spam@telerama.lm.com> My homepage Beer status: Pilsener bottled and aging
On Wed, 11 Jan 1995, Mark Oeltjenbruns wrote:
(3) Perform a histogram analysis on it; if it doesn't pass a certain threshold, reject the whole thing. Although cute, I don't like this one.
Why not -- sounds cool to me. It is also very fast, and does not take much programming. It will stop all cleartext. Probably some pictures would get through, so it would not stop mailbombings, but a volume limitation per apparent user and apparent destination would stop mailbombings. A volume limitation sounds like a lot of work to program though.

---------------------------------------------------------------------
We have the right to defend ourselves and our property, because of the kind of animals that we are. True law derives from this right, not from the arbitrary power of the omnipotent state.
http://nw.com/jamesd/ James A. Donald jamesd@netcom.com
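For what it's worth, a volume limitation of the kind mentioned above need not be much code. The following is a rough sketch under assumptions of my own: a byte quota over a sliding time window, kept in memory, applied separately to each apparent sender and each apparent destination. The class name, parameters, and quota values are hypothetical.

```python
import time
from collections import defaultdict, deque


class VolumeLimiter:
    """Sliding-window byte quota per apparent sender and per destination."""

    def __init__(self, max_bytes: int, window_seconds: float):
        self.max_bytes = max_bytes
        self.window = window_seconds
        self._log = defaultdict(deque)    # key -> deque of (timestamp, size)
        self._totals = defaultdict(int)   # key -> bytes inside the window

    def _expire(self, key, now: float) -> None:
        # Drop entries that have aged out of the window.
        log = self._log[key]
        while log and log[0][0] <= now - self.window:
            _, size = log.popleft()
            self._totals[key] -= size

    def allow(self, sender: str, destination: str, size: int, now=None) -> bool:
        """Record the message and return True if both quotas permit it."""
        now = time.time() if now is None else now
        keys = (("from", sender), ("to", destination))
        for key in keys:
            self._expire(key, now)
            if self._totals[key] + size > self.max_bytes:
                return False  # rejected messages are not counted
        for key in keys:
            self._log[key].append((now, size))
            self._totals[key] += size
        return True


if __name__ == "__main__":
    limiter = VolumeLimiter(max_bytes=100_000, window_seconds=3600)
    print(limiter.allow("alice@example.org", "bob@example.org", 60_000))
    print(limiter.allow("alice@example.org", "carol@example.org", 60_000))
```

Since sender and destination are only "apparent", a determined mailbomber can forge either one, so this limits casual abuse rather than preventing it.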
participants (5)
- eric@remailer.net
- James A. Donald
- marko@millcomm.com
- spam@telerama.lm.com
- tcmay@netcom.com