[FoRK] X.509 certificate collision via MD5 collisions (fwd from jeff at k2.com)

Wed Mar 2 10:40:39 PST 2005

Eugen forwards from FoRK:
> >Colliding X.509 Certificates version 1.0
> >1st March 2005
> >Arjen Lenstra, Xiaoyun Wang, and Benne de Weger
> >
> >http://eprint.iacr.org/2005/067
> >
> >We announce a method for the construction of pairs of valid X.509
> >certificates in which the ?to be signed? parts form a collision for
> >the MD5 hash function. As a result the issuer signatures in the
> >certificates will be the same when the issuer uses MD5 as its hash
> >function.

The real news of the paper was the announcement that Wang's techniques
will be revealed this May at Eurocrypt.  I'm looking forward to finding
out what the secret is!  Presumably everyone will receive MD5 collision
finding software at around that time.

The cert collision is not a surprise, people anticipated this possibility
shortly after the MD5 collisions were announced.  And notice that Xiaoyun
Wang was an author of this paper; she was of course the lead author
on the original MD5 collision paper and presumably the originator of
the technique for finding MD5 collisions.  Using her technology it is
straightforward to do this kind of thing.  But no one else could have
written this paper at this time.

The only nontrivial part (given the remarkable ability to generate MD5
collisions) was arranging that both keys were valid RSA moduli with
known factors.  The did this by generating random bignums and trying to
factor them.

And keep in mind that her methods find random-ish collisions.  They don't
find matches to existing hashes, and (as far as we know) they don't
find structured collisions as would be necessary to get two certs with
different and plausible-sounding names in them.

>From what I've read (mostly http://eprint.iacr.org/2004/264), the way
these collisions are found is to start with analysis of the structure
of the hash, and decide on an XOR difference between the two inputs.
This implicitly makes certain assumptions about where and when carries
and other nonlinearities will occur in the hash calculation.  Then you
do a search for inputs which match that pattern of carries and for
which the pre-determined XOR difference yields an actual collision.
This doesn't give you much ability to control the content of the two
inputs that you create.

Hal