Re: X.509, S/MIME, and evolution of PGP
Date: Wed, 27 Sep 1995 15:22:08 -0700
From: Bill Stewart <stewarts@ix.netcom.com>
Bill,
I'd always heard X.509 public key certificates were a hierarchical, evil, anti-WebOfTrust ISOism. But Netscape is now doing them, and talking S/MIME, so I sat down to read the specs, and they're really not all that bad.
I was all set to scream, when I read this first sentence. Then I read on.
(Technically, I've only read PKCS#6 and RFC 1422, and not the real ISOisms...) Yeah, they've got lots of clunky ASN.1 Ambiguous Encoding Rules and X.500 Silly Name Format, but those can be lived with, and the X.500 may be possible to simply ignore in most cases.
At this point, I realized that we agree in evaluation but not in weighting. Twice now, I have had to deal with X.509 certificates in real code and it is excruciatingly painful -- especially for someone, like me, with some background in performance engineering and much background in software engineering.

ASN.1 is not merely ambiguous, it is actively wrong as part of a design methodology. It encourages people to define structures in the BNF style -- and they do (witness X.509). When you translate this into C or Pascal structures by an automatic translator, you end up with structures whose definitions are nested so deeply that even with short field names you would have variable names which occupy a substantial part of a line of text. However, ASN.1's BNF-ness encourages people to use longNamesWithEmbeddedCapitals -- so you end up with variable names which turn routine C function calls into multi-line, unreadable blocks.

You also end up with too much code. I recently had to deal with X.509 certs for an authentication application (a firewall proxy). The proxy was about 30KB of code prior to the ASN.1. The ASN.1 code, just to do packing and parsing, was over 100KB (.o file sizes, in both cases).

You also end up with too many bytes being transferred. I worked out an example of ASN.1 abuse -- defining a triple-DES key structure for encryption and transmittal -- as a raw C structure (following long-established practice and performance engineering: an array of unsigned char, with offsets for each key and the IV) and as ASN.1 (following modern ASN.1 practice). The raw C structure was 32 bytes long. The ASN.1 structure was 86 bytes. Worse was the code dedicated to structure definition and packing/unpacking. In the raw C case, it took 48 ASCII characters to define the structure and its offsets (including comments) and nothing to pack/unpack. With ASN.1, it took 55085 characters of definition, pack and unpack code. This is a factor of 1148 in source code expansion.
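For readers who want to see the raw layout concretely, here is a minimal sketch, written in Python rather than C. The field order (IV first, then the three keys) and the names `pack_3des`, `field` and the `*_OFF` constants are assumptions made for illustration; the message only says the structure was an array of unsigned char with an offset for each key and the IV. The two ratios at the end are simply recomputed from the figures quoted above.

```python
# Hypothetical layout: IV first, then the three DES keys, 8 bytes each.
# The message does not give the actual order, so this is an assumption.
IV_OFF, K1_OFF, K2_OFF, K3_OFF = 0, 8, 16, 24
FIELD_LEN = 8
BLOB_LEN = 32


def pack_3des(iv: bytes, k1: bytes, k2: bytes, k3: bytes) -> bytes:
    """Concatenate the four fixed-size fields; nothing else is encoded."""
    for part in (iv, k1, k2, k3):
        if len(part) != FIELD_LEN:
            raise ValueError("every field must be exactly 8 bytes")
    return iv + k1 + k2 + k3


def field(blob: bytes, offset: int) -> bytes:
    """Read one field back by offset; 'unpacking' is just a slice."""
    if len(blob) != BLOB_LEN:
        raise ValueError("expected a 32-byte structure")
    return blob[offset:offset + FIELD_LEN]


# Ratios recomputed from the numbers quoted in the message.
wire_ratio = 86 / 32        # ASN.1 bytes on the wire vs. raw bytes
source_ratio = 55085 / 48   # ASN.1 source characters vs. raw definition
```

Reading a key back is `field(blob, K2_OFF)`; there is no length or tag to interpret, which is the property the 32-byte figure depends on.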
--------------------------------

I could go on at length, and have in other fora.

Not only is ASN.1 clearly the work of Satan, the Distinguished Name definition is more than a victim of ASN.1 generality, it is a clear reason for the unpopularity of systems which use it. Do you remember when X.400 names started showing up in e-mail (e.g., with Lotus Notes)? How many of those names do you see now? It didn't work. The concept is flawed -- but it lives on in X.509. [It reminds me of a flaky grad student's idea of a way to do things -- elevated to standard before people had the chance to try it and discover how completely bogus it was.]

It is possible to implement something which reads and writes ASN.1 -- but it is ugly, it inflates your code and it hurts your runtime.

I would like to see as many hold-outs against ASN.1 and Distinguished Names as possible. PGP is one such. TIS/MOSS has learned its lesson (from PEM days) and is making all of the ASN.1 and DN stuff (X.509) optional. With luck, the X.509 parts will die away (although MOSS was retarded so strongly in the PEM days that it may never recover -- may never acquire the market share to make it a force).

I would strongly encourage others to join the battle. This might not be easy. It is clear that there is an ASN.1 juggernaut. It is taking over all sorts of standards. I believe I know why. It makes the job of the standards writer easier. However, I also believe it needs to be fought...not merely to save future S/W development efforts from the waste and abuse which ASN.1 creates, but also to take a stand against the process by which non-implementors get together on standards committees and come out with standards which preclude good software architectures -- and who, in a kind of old-boy-network, endorse other standards (e.g., the ISO set) as part of their own. Such a design process is destructive.

 - Carl

+--------------------------------------------------------------------------+
|Carl M. Ellison cme@tis.com http://www.clark.net/pub/cme |
|Trusted Information Systems, Inc. http://www.tis.com/ |
|3060 Washington Road PGP 2.6.2: 61E2DE7FCB9D7984E9C8048BA63221A2|
|Glenwood MD 21738 Tel:(301)854-6889 FAX:(301)854-5363 |
+--------------------------------------------------------------------------+
In his message Carl makes several statements, some of which I agree with, and some of which I disagree with. Since I'm a protocol wonk, and since I've been doing ASN.1 and PER/BER stuff recently, I'd like to respond to some of his points. (I spent most of last week off work with a nasty cold, semi-comatose on the couch, smothered in vapo-rub, surrounded by the 93 specs for ASN.1, BER/DER, and PER, and with nothing on TV but the OJ trial. What a choice :-)

[A lot of this is leading up to a big rant about the truly ghastly packet formats given to us by STT, which I've found loosens more mucus than a gallon of cough syrup, and with much the same effect on your mental state :-)]

I'm not going to be defending BER (BrainDamaged Encoding Rules), because, let's face it, they suck. I'm also not going to be defending X.500, because, to a first approximation, it completely sucks too. In this message I'm just going to address the issue of the inherent inefficiency. I'll address the rest in a follow-up message, most specifically the claim that making marshalling and de-marshalling hard on the implementor is a good thing.

It's hard to speak to the issue of code size, since the ISODE compilers, which are frequently used as a benchmark in this area, are so goddam awful. Even the most naive compilers will generally generate code orders of magnitude smaller. Instead, we'll take a look at bits on the wire, and compare the struct dump to what can be done by a 20th century compiler using a smart set of encoding rules (PER - the packed encoding rules).

[As a side note, I recently wrote some code that had to parse and process X.509 certificates - this was for my SSL Keep-Away proxy (it needed to crunch the certificate, look for hostname matches in any CN values, and possibly convert the DN into RFC1485 text format). The source was only a few K. I was using C++ though (and this didn't hurt for once).]

Let's use 3DES as our example.
We'll start with a naive specification:

--
LongLong ::= OCTET STRING (SIZE(8)) -- a long long is 8 bytes, er, long

DesKey ::= LongLong

ThreeDes ::= SEQUENCE {
    IV LongLong,
    K1 DesKey,
    K2 DesKey,
    K3 DesKey
}
--

Let's apply the packed encoding rules to this:

ThreeDes is a SEQUENCE. It has no optional components, so no bits are added to the encoding. The first item, IV, is an OCTET STRING of fixed length 8 bytes. Since the length is fixed, no length is encoded - the 8 bytes of the IV are appended to the encoding. The same applies to each of the DES keys. Thus, we have a bits-on-the-wire total of 32 bytes. The same as in the hand-crafted encoding.

The encoding and decoding are then implemented as memcpys. If more information is known about the alignment and position in memory of the fields, and of the key within the buffer, these memcpys can be coalesced - this is a local optimisation, rather than a requirement that every interoperable implementation use the same language with the same compiler.

Now, this example is pretty simple, but with not much thought, we can set about improving it to generate fewer bits on the wire. I'll avoid the obvious kludge, which is to strip off the parity bits on each key to save three bytes - instead we'll look to the big wins.

There are several different ways of using 3des which can help us reduce the size of the encodings in some cases. The first thing we can do is support variable size IVs (like in rfc1851). We'll restrict the IV to be either 1 or 2 32 bit chunks. Then we can add extra support for 1des mode of 3des where all the keys are the same. Here's the new definitions:

--
Long ::= OCTET STRING (SIZE(4))

ThreeDes ::= SEQUENCE {
    IV   SEQUENCE (SIZE(1..2)) OF Long,
    Key1 DesKey,
    Key2 DesKey OPTIONAL,
    Key3 DesKey OPTIONAL
}
--

Now let's see how the PER treat this value. The first thing we encode is the sequence.
Since this sequence has optional components, we stick one bit onto the output stream for each optional field - if the bit is one, the optional element is present - otherwise, it ain't. Since there are two optional components, we need two bits.

Next, we need to encode the IV. Since this field is of variable length, we do need to encode a length this time. The length is constrained to be between 1 and 2 - only two possible values - so the minimum number of bits needed to encode it is 1, and a 1 bit field is appended to the encoding. Now we encode the longs in the IV; because these values are OCTET STRINGs, we need to align ourselves on an octet boundary, if we're not there already. Once we've emitted any necessary pad bits, we encode the IV as the indicated number of 4 byte values. After that we encode the first key as described above, and if the second and third key are present, we encode those as well.

If there is room for 3 bits in the byte preceding this encoding (a likely occurrence, especially if the application supports several different key types (RC4 & IDEA, etc)), this encoding is still 32 bytes in the worst case, and 12 in the best case.

To be continued... (unless I get flamed off the list)
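The walk-through above can be followed by hand in a few lines. This is a hand-rolled sketch of those steps, not output from an ASN.1 compiler and not a complete PER implementation. The function names are hypothetical, and putting the three preamble bits in a byte of their own (padded with five zero bits) is an assumption; the message instead imagines them sharing a byte with whatever precedes the structure.

```python
from typing import List, Optional, Tuple


def encode_three_des(iv: List[bytes], key1: bytes,
                     key2: Optional[bytes] = None,
                     key3: Optional[bytes] = None) -> bytes:
    """Encode the revised ThreeDes value following the steps in the text."""
    if len(iv) not in (1, 2) or any(len(chunk) != 4 for chunk in iv):
        raise ValueError("IV must be one or two 4-byte chunks")
    keys = [k for k in (key1, key2, key3) if k is not None]
    if any(len(k) != 8 for k in keys):
        raise ValueError("each key must be 8 bytes")
    # Bit 7: Key2 present. Bit 6: Key3 present. Bit 5: IV chunk count minus 1.
    # The low five bits are padding up to the octet boundary.
    header = (int(key2 is not None) << 7) \
        | (int(key3 is not None) << 6) \
        | ((len(iv) - 1) << 5)
    return bytes([header]) + b"".join(iv) + b"".join(keys)


def decode_three_des(data: bytes) -> Tuple[List[bytes], bytes,
                                           Optional[bytes], Optional[bytes]]:
    """Inverse of encode_three_des: read the three bits, then slice."""
    header = data[0]
    has_key2 = bool(header & 0x80)
    has_key3 = bool(header & 0x40)
    chunks = ((header >> 5) & 1) + 1
    pos = 1
    iv = [data[pos + 4 * i: pos + 4 * i + 4] for i in range(chunks)]
    pos += 4 * chunks
    key1 = data[pos:pos + 8]
    pos += 8
    key2 = key3 = None
    if has_key2:
        key2 = data[pos:pos + 8]
        pos += 8
    if has_key3:
        key3 = data[pos:pos + 8]
    return iv, key1, key2, key3
```

With the preamble in its own byte, the sketch produces 13 bytes in the best case and 33 in the worst. The 12 and 32 quoted above assume the three bits fit into a byte that the enclosing encoding has already started.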
(my friend Tom is cc:'d as the Multics expert, re: the ref below) -----BEGIN PGP SIGNED MESSAGE-----
Date: Mon, 2 Oct 1995 17:13:21 -0700 (PDT)
From: Simon Spero <ses@tipper.oit.unc.edu>
Let's use 3DES as our example. We'll start with a naive specification:
[etc.] You're starting down the same road I did in writing my example of how ASN.1 seduces you into bad design. ..very good, but you stopped short. [The PER sounds much better than BER -- but I've never seen PER before. I learned enough about ASN.1 to have decided it is a lost cause -- far easier to let ASN.1 advocates talk to themselves while I go off to do something independent and good.] Back to the example.
-- LongLong ::= OCTET STRING (SIZE(8)) -- a long long is 8 bytes, er, long
Really? There is an OCTET STRING (SIZE(8)) and you can make it a datatype? I suppose you can make an OCTET STRING (SIZE(9)) too? That can be really convenient. You can have a tagged quantity (using the top byte). Alternatively, someone could define:

DesKey ::= SEQUENCE {
    encr  BIT STRING (SIZE(1)),   -- encrypt mode if 1, decrypt if 0
    value OCTET STRING (SIZE(8))
}

and now you can use DesKey as your data type with no bad effects and only good ones (as far as the ASN.1 user is concerned). Of course, the code to pack/unpack just exploded. So did the packet size (maybe, depending on effort spent in pack/unpack) and so did the internal struct, probably.

[Truth in advertising: the example above is adapted from early Multics where PL/I allowed you to do such nonsense and some programmer saw the power of it -- so he used it in the file system, until he got caught.]

Lesson from this: there is a reason not to give a designer generality you would not use in an actual implementation.

Anyway -- my example of ASN.1 abuse is along these lines but I won't reproduce it here. We can leave this as a parlor game for computer geeks. :-)
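To make the cost of that one extra bit concrete, here is a rough sketch, assuming the flag bit is written first and the 64 key bits follow immediately with no alignment padding. The function names are hypothetical and this is not what any particular ASN.1 compiler generates. An encoder that pads to an octet boundary after the flag would avoid the shifting but would still spend a ninth byte.

```python
from typing import Tuple

KEY_BITS = 64


def pack_tagged_key(encrypt: bool, key: bytes) -> bytes:
    """Pack a 1-bit mode flag followed directly by 64 key bits (65 bits total)."""
    if len(key) != 8:
        raise ValueError("key must be 8 bytes")
    bits = (int(encrypt) << KEY_BITS) | int.from_bytes(key, "big")
    # Left-justify the 65 bits in 9 bytes; the low 7 bits are padding.
    return (bits << 7).to_bytes(9, "big")


def unpack_tagged_key(blob: bytes) -> Tuple[bool, bytes]:
    """Undo pack_tagged_key; every key byte straddles a byte boundary."""
    if len(blob) != 9:
        raise ValueError("expected 9 bytes")
    bits = int.from_bytes(blob, "big") >> 7
    encrypt = bool(bits >> KEY_BITS)
    key = (bits & ((1 << KEY_BITS) - 1)).to_bytes(8, "big")
    return encrypt, key
```

Compared with the plain 8-byte key, where unpacking is a slice, the value grows to 9 bytes and every read now goes through shifts and masks.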
Here's the new definitions:
-- Long ::= OCTET STRING (SIZE(4))
ThreeDes ::= SEQUENCE { IV SEQUENCE (SIZE(1..2)) OF Long, Key1 DesKey, Key2 DesKey OPTIONAL, Key3 DesKey OPTIONAL }
See -- ASN.1 is powerful in its seductiveness. Even though you were trying to convince me that it can be the same as my primitive example (and therefore just as efficient), you couldn't resist using the power of the generality to elaborate on the structure. This is not a good feature of ASN.1. This is its primary fault. This is why I call it a work of Satan. (BER/DER helps in that evaluation, of course).
To be continued... (unless I get flamed off the list)
As I said, this could be a wonderful parlor game -- or list topic, if people want to waste the time. Think of it as the Crossword Puzzle page of the cypherpunks on-line newspaper. :-)

 - Carl

+--------------------------------------------------------------------------+
|Carl M. Ellison cme@tis.com http://www.clark.net/pub/cme |
|Trusted Information Systems, Inc. http://www.tis.com/ |
|3060 Washington Road PGP 2.6.2: 61E2DE7FCB9D7984E9C8048BA63221A2|
|Glenwood MD 21738 Tel:(301)854-6889 FAX:(301)854-5363 |
+--------------------------------------------------------------------------+

-----BEGIN PGP SIGNATURE-----
Version: 2.6.2

iQCVAwUBMHFNJFQXJENzYr45AQFV2AP9H1/A5bY4H8C/Ms3dhHIPOWiLCYhqLzFR
qKdvQaBYvPDCrr8jXLwQhTogvzu/9gkZ2DwnXVya7MxEpyy+1A5WrO3Jlqu+6Euy
bBcl1idhoomMzmzOga/F7YasXsFkoZoSqNYQKX/ZKcFvEuDGrzohlBNV5ubDEL7G
E3hdsak0f2Y=
=cjU/
-----END PGP SIGNATURE-----
participants (2): Carl Ellison, Simon Spero