[wrong][spam] Cracking PGP
PGP is a puzzle like any other. Let's break it open and destroy everyone's encryption! To perform this task, I will be using an i386 from 1989. The code will be written in qbasic and executed in ms-dos.
I have no freakin' idea how PGP works but I understand it involves multiplying numbers that have enough bitwidth that the people trying to read encrypted messages secured by it get all confused and distracted by how many many bits wide these prime integers multiplication are.
I looked it up on the nets.
RSA functions around: (m**e)**d congruent m mod n (m**d)**e congruent m mod n 0 <= m <= n I think "a congruent b mod n" means that a - b is an integer multiple of n. The public key is e and the private key is d. So, m is the message and m**d is the encrypted data. Code for this is in libcrypto.
On Fri, Jun 18, 2021, 1:28 PM Karl <gmkarl@gmail.com> wrote:
I looked it up on the nets.
RSA functions around:
(m**e)**d congruent m mod n (m**d)**e congruent m mod n 0 <= m <= n
I think "a congruent b mod n" means that a - b is an integer multiple of n.
The public key is e and the private key is d. So, m is the message and m**d is the encrypted data.
Code for this is in libcrypto.
* libgcrypt
n is also in the public key. It's generated from two prime numbers which I'm guessing are e and d, but I saw them called p and q while I was still looking at https://en.m.wikipedia.org/wiki/RSA_(cryptosystem)#Key_generation . Having some trouble continuing.
Here's the key generation section from the Wikipedia article. It's hard to move my body toward this. ===Key generation=== The keys for the RSA algorithm are generated in the following way: # Choose two distinct [[prime number]]s ''p'' and ''q''. #* For security purposes, the integers ''p'' and ''q'' should be chosen at random, and should be similar in magnitude but differ in length by a few digits to make factoring harder.<ref name="rsa"/> Prime integers can be efficiently found using a [[primality test]]. #* ''p'' and ''q'' are kept secret. # Compute {{nowrap|1=''n'' = ''pq''}}. #* ''n'' is used as the [[Modular arithmetic|modulus]] for both the public and private keys. Its length, usually expressed in bits, is the [[key length]]. #* ''n'' is released as part of the public key. # Compute ''λ''(''n''), where ''λ'' is [[Carmichael's totient function]]. Since ''n'' = ''pq'', ''λ''(''n'') = [[least common multiple|lcm]](''λ''(''p''),''λ''(''q'')), and since ''p'' and ''q'' are prime, ''λ''(''p'') = ''[[Euler totient function|φ]]''(''p'') = ''p'' − 1 and likewise ''λ''(''q'') = ''q'' − 1. Hence ''λ''(''n'') = lcm(''p'' − 1, ''q'' − 1). #* ''λ''(''n'') is kept secret. #* The lcm may be calculated through the [[Euclidean algorithm]], since lcm(''a'',''b'') = ''|ab|''/gcd(''a'',''b''). # Choose an integer ''e'' such that {{nowrap|1 < ''e'' < ''λ''(''n'')}} and {{nowrap|1=[[greatest common divisor|gcd]](''e'', ''λ''(''n'')) = 1}}; that is, ''e'' and ''λ''(''n'') are [[coprime]]. #* ''e'' having a short [[bit-length]] and small [[Hamming weight]] results in more efficient encryption {{snd}} the most commonly chosen value for ''e'' is {{nowrap|1=2<sup>16</sup> + 1 = 65,537}}. The smallest (and fastest) possible value for ''e'' is 3, but such a small value for ''e'' has been shown to be less secure in some settings.<ref name="Boneh99"> {{cite journal | url = http://crypto.stanford.edu/~dabo/abstracts/RSAattack-survey.html | last = Boneh | first = Dan | title = Twenty Years of attacks on the RSA Cryptosystem | journal = [[Notices of the American Mathematical Society]] | volume = 46 | issue = 2 | pages = 203–213 | year = 1999 }}</ref> #*''e'' is released as part of the public key. # Determine ''d'' as {{nowrap|''d'' ≡ ''e''<sup>−1</sup> (mod ''λ''(''n''))}}; that is, ''d'' is the [[modular multiplicative inverse]] of ''e'' modulo ''λ''(''n''). #* This means: solve for ''d'' the equation {{nowrap|''d''⋅''e'' ≡ 1 (mod ''λ''(''n''))}}; ''d'' can be computed efficiently by using the [[Extended Euclidean algorithm]], since, thanks to ''e'' and ''λ''(''n'') being coprime, said equation is a form of [[Bézout's identity]], where ''d'' is one of the coefficients. #* ''d'' is kept secret as the ''private key exponent''. The ''public key'' consists of the modulus ''n'' and the public (or encryption) exponent ''e''. The ''private key'' consists of the private (or decryption) exponent ''d'', which must be kept secret. ''p'', ''q'', and ''λ''(''n'') must also be kept secret because they can be used to calculate ''d''. In fact, they can all be discarded after ''d'' has been computed.<ref>Applied Cryptography, John Wiley & Sons, New York, 1996. [[Bruce Schneier]], p. 467</ref>
I'll go through it a little. On Fri, Jun 18, 2021, 1:58 PM Karl <gmkarl@gmail.com> wrote:
===Key generation=== The keys for the RSA algorithm are generated in the following way: # Choose two distinct [[prime number]]s ''p'' and ''q''.
#* ''p'' and ''q'' are kept secret.
# Compute 'n'' = ''pq''
public n = private p * private q n is keylength bits long. # Compute ''λ''(''n''), where ''λ'' is [[Carmichael's totient function]].
Since ''n'' = ''pq'', ''λ''(''n'') = [[least common multiple|lcm]](''λ''(''p''),''λ''(''q'')), and since ''p'' and ''q'' are prime, ''λ''(''p'') = ''[[Euler totient function|φ]]''(''p'') = ''p'' − 1 and likewise ''λ''(''q'') = ''q'' − 1. Hence ''λ''(''n'') = lcm(''p'' − 1, ''q'' − 1). #* ''λ''(''n'') is kept secret.
lambda(n) is calculated and has a number of relevant algebraic identities associated #* The lcm may be calculated through the [[Euclidean algorithm]], since
lcm(''a'',''b'') = ''|ab|''/gcd(''a'',''b''). # Choose an integer ''e'' such that {{nowrap|1 < ''e'' < ''λ''(''n'')}} and {{nowrap|1=[[greatest common divisor|gcd]](''e'', ''λ''(''n'')) = 1}}; that is, ''e'' and ''λ''(''n'') are [[coprime]]. #* ''e'' having a short [[bit-length]] and small [[Hamming weight]] results in more efficient encryption {{snd}} the most commonly chosen value for ''e'' is {{nowrap|1=2<sup>16</sup> + 1 = 65,537}}. The smallest (and fastest) possible value for ''e'' is 3, but such a small value for ''e'' has been shown to be less secure in some settings.<ref name="Boneh99"> {{cite journal | url = http://crypto.stanford.edu/~dabo/abstracts/RSAattack-survey.html | last = Boneh | first = Dan | title = Twenty Years of attacks on the RSA Cryptosystem | journal = [[Notices of the American Mathematical Society]] | volume = 46 | issue = 2 | pages = 203–213 | year = 1999 }}</ref> #*''e'' is released as part of the public key. # Determine ''d'' as {{nowrap|''d'' ≡ ''e''<sup>−1</sup> (mod ''λ''(''n''))}}; that is, ''d'' is the [[modular multiplicative inverse]] of ''e'' modulo ''λ''(''n''). #* This means: solve for ''d'' the equation {{nowrap|''d''⋅''e'' ≡ 1 (mod ''λ''(''n''))}}; ''d'' can be computed efficiently by using the [[Extended Euclidean algorithm]], since, thanks to ''e'' and ''λ''(''n'') being coprime, said equation is a form of [[Bézout's identity]], where ''d'' is one of the coefficients. #* ''d'' is kept secret as the ''private key exponent''. The ''public key'' consists of the modulus ''n'' and the public (or encryption) exponent ''e''. The ''private key'' consists of the private (or decryption) exponent ''d'', which must be kept secret. ''p'', ''q'', and ''λ''(''n'') must also be kept secret because they can be used to calculate ''d''. In fact, they can all be discarded after ''d'' has been computed.<ref>Applied Cryptography, John Wiley & Sons, New York, 1996. [[Bruce Schneier]], p. 467</ref>
Maybe clean it up some On Fri, Jun 18, 2021, 2:00 PM Karl <gmkarl@gmail.com> wrote:
I'll go through it a little.
On Fri, Jun 18, 2021, 1:58 PM Karl <gmkarl@gmail.com> wrote:
===Key generation=== The keys for the RSA algorithm are generated in the following way: # Choose two distinct [[prime number]]s ''p'' and ''q''.
#* ''p'' and ''q'' are kept secret.
# Compute 'n'' = ''pq''
public n = private p * private q n is keylength bits long.
# Compute ''λ''(''n''), where ''λ'' is Carmichael's totient function.
Since ''n'' = ''pq'', ''λ''(''n'') = lcm(''λ''(''p''),''λ''(''q'')), and since ''p'' and ''q'' are prime, ''λ''(''p'') = ''φ''(''p'') = ''p'' − 1 where φ is the Euler totient function, and likewise ''λ''(''q'') = ''q'' − 1. Hence ''λ''(''n'') = lcm(''p'' − 1, ''q'' − 1)
private lambda(n) is calculated and has a number of relevant algebraic identities associated
#* The lcm may be calculated through the Euclidean algorithm, since
lcm(''a'',''b'') = ''|ab|''/gcd(''a'',''b''). # Choose an integer ''e'' such that 1 < ''e'' < ''λ''(''n'') and gcd(''e'', ''λ''(''n'')) = 1; that is, ''e'' and ''λ''(''n'') are coprime.
Organised a little up to here. #* ''e'' having a short [[bit-length]] and small [[Hamming weight]] results
in more efficient encryption {{snd}} the most commonly chosen value for ''e'' is {{nowrap|1=2<sup>16</sup> + 1 = 65,537}}. The smallest (and fastest) possible value for ''e'' is 3, but such a small value for ''e'' has been shown to be less secure in some settings.<ref name="Boneh99"> {{cite journal | url = http://crypto.stanford.edu/~dabo/abstracts/RSAattack-survey.html | last = Boneh | first = Dan | title = Twenty Years of attacks on the RSA Cryptosystem | journal = [[Notices of the American Mathematical Society]] | volume = 46 | issue = 2 | pages = 203–213 | year = 1999 }}</ref> #*''e'' is released as part of the public key. # Determine ''d'' as {{nowrap|''d'' ≡ ''e''<sup>−1</sup> (mod ''λ''(''n''))}}; that is, ''d'' is the [[modular multiplicative inverse]] of ''e'' modulo ''λ''(''n''). #* This means: solve for ''d'' the equation {{nowrap|''d''⋅''e'' ≡ 1 (mod ''λ''(''n''))}}; ''d'' can be computed efficiently by using the [[Extended Euclidean algorithm]], since, thanks to ''e'' and ''λ''(''n'') being coprime, said equation is a form of [[Bézout's identity]], where ''d'' is one of the coefficients. #* ''d'' is kept secret as the ''private key exponent''. The ''public key'' consists of the modulus ''n'' and the public (or encryption) exponent ''e''. The ''private key'' consists of the private (or decryption) exponent ''d'', which must be kept secret. ''p'', ''q'', and ''λ''(''n'') must also be kept secret because they can be used to calculate ''d''. In fact, they can all be discarded after ''d'' has been computed.<ref>Applied Cryptography, John Wiley & Sons, New York, 1996. [[Bruce Schneier]], p. 467</ref>
The keys for the RSA algorithm are generated in the following way: # Choose two distinct [[prime number]]s ''p'' and ''q''.
#* ''p'' and ''q'' are kept secret.
# Compute 'n'' = ''pq''
public n = private p * private q n is keylength bits long.
# Compute ''λ''(''n''), where ''λ'' is Carmichael's totient function.
Since ''n'' = ''pq'', ''λ''(''n'') = lcm(''λ''(''p''),''λ''(''q'')), and since ''p'' and ''q'' are prime, ''λ''(''p'') = ''φ''(''p'') = ''p'' − 1 where φ is the Euler totient function, and likewise ''λ''(''q'') = ''q'' − 1. Hence ''λ''(''n'') = lcm(''p'' − 1, ''q'' − 1)
private lambda(n) is calculated and has a number of relevant algebraic identities associated
#* The lcm may be calculated through the Euclidean algorithm, since
lcm(''a'',''b'') = ''|ab|''/gcd(''a'',''b''). # Choose an integer ''e'' such that 1 < ''e'' < ''λ''(''n'') and gcd(''e'', ''λ''(''n'')) = 1; that is, ''e'' and ''λ''(''n'') are coprime.
#* ''e'' having a short bit-length and small Hamming weight results in
more efficient encryption -- the most commonly chosen value for ''e'' is 2**16 + 1 = 65,537. The smallest (and fastest) possible value for ''e'' is 3, but such a small value for ''e'' has been shown to be less secure in some settings. #*''e'' is released as part of the public key.
Trying to clean up a little more. I'm certain I can complete this before the next millennium. ===Key generation=== public e is often a very simple reused constant # Determine ''d'' as ''d'' ≡ ''e''**−1 (mod ''λ''(''n'')); that is, ''d''
is the modular multiplicative inverse of ''e'' modulo ''λ''(''n'').
Organised a little up to here.
#* This means: solve for ''d'' the equation {{nowrap|''d''⋅''e'' ≡ 1 (mod
''λ''(''n''))}}; ''d'' can be computed efficiently by using the [[Extended Euclidean algorithm]], since, thanks to ''e'' and ''λ''(''n'') being coprime, said equation is a form of [[Bézout's identity]], where ''d'' is one of the coefficients. #* ''d'' is kept secret as the ''private key exponent''. The ''public key'' consists of the modulus ''n'' and the public (or encryption) exponent ''e''. The ''private key'' consists of the private (or decryption) exponent ''d'', which must be kept secret. ''p'', ''q'', and ''λ''(''n'') must also be kept secret because they can be used to calculate ''d''. In fact, they can all be discarded after ''d'' has been computed.<ref>Applied Cryptography, John Wiley & Sons, New York, 1996. [[Bruce Schneier]], p. 467</ref>
The keys for the RSA algorithm are generated in the following way: # Choose two distinct [[prime number]]s ''p'' and ''q''.
#* ''p'' and ''q'' are kept secret.
# Compute 'n'' = ''pq''
public n = private p * private q n is keylength bits long.
# Compute ''λ''(''n''), where ''λ'' is Carmichael's totient function.
Since ''n'' = ''pq'', ''λ''(''n'') = lcm(''λ''(''p''),''λ''(''q'')), and since ''p'' and ''q'' are prime, ''λ''(''p'') = ''φ''(''p'') = ''p'' − 1 where φ is the Euler totient function, and likewise ''λ''(''q'') = ''q'' − 1. Hence ''λ''(''n'') = lcm(''p'' − 1, ''q'' − 1)
private lambda(n) is calculated and has a number of relevant algebraic identities associated
#* The lcm may be calculated through the Euclidean algorithm, since
lcm(''a'',''b'') = ''|ab|''/gcd(''a'',''b''). # Choose an integer ''e'' such that 1 < ''e'' < ''λ''(''n'') and gcd(''e'', ''λ''(''n'')) = 1; that is, ''e'' and ''λ''(''n'') are coprime.
#* ''e'' having a short bit-length and small Hamming weight results in
more efficient encryption -- the most commonly chosen value for ''e'' is 2**16 + 1 = 65,537. The smallest (and fastest) possible value for ''e'' is 3, but such a small value for ''e'' has been shown to be less secure in some settings. #*''e'' is released as part of the public key.
===Key generation=== public e is often a very simple reused constant
# Determine ''d'' as ''d'' ≡ ''e''**−1 (mod ''λ''(''n'')); that is, ''d''
is the modular multiplicative inverse of ''e'' modulo ''λ''(''n'').
#* This means: solve for ''d'' the equation ''d''⋅''e'' ≡ 1 (mod
''λ''(''n'')); ''d'' can be computed efficiently by using the Extended Euclidean algorithm, since, thanks to ''e'' and ''λ''(''n'') being coprime, said equation is a form of Bézout's identity, where ''d'' is one of the coefficients. #* ''d'' is kept secret as the ''private key exponent''. The ''public key'' consists of the modulus ''n'' and the public (or encryption) exponent ''e''. The ''private key'' consists of the private (or decryption) exponent ''d'', which must be kept secret. ''p'', ''q'', and ''λ''(''n'') must also be kept secret because they can be used to calculate ''d''. In fact, they can all be discarded after ''d'' has been computed.
Okay! I removed all the wiki markup!
Basic summary: "e" and "d" are computed using private "p" , "q", and lambda(n), so that their properties hold. These computations are done using an algebra of modular arithmetic. Totient functions, greatest common multiples, and many other things. When solving the systems involved in "e" and "d", that algebra would be discovered.
Here are further words from Wikipedia, to learn against later. https://en.m.wikipedia.org/wiki/RSA_(cryptosystem) In the original RSA paper,[2] <https://en.m.wikipedia.org/wiki/RSA_%28cryptosystem%29#cite_note-rsa-2> the Euler totient function <https://en.m.wikipedia.org/wiki/Euler_totient_function> *φ*(*n*) = (*p* − 1)(*q* − 1) is used instead of *λ*(*n*) for calculating the private exponent *d*. Since *φ*(*n*) is always divisible by *λ*(*n*) the algorithm works as well. That the Euler totient function <https://en.m.wikipedia.org/wiki/Euler_totient_function> can be used can also be seen as a consequence of Lagrange's theorem <https://en.m.wikipedia.org/wiki/Lagrange%27s_theorem_%28group_theory%29> applied to the multiplicative group of integers modulo pq <https://en.m.wikipedia.org/wiki/Multiplicative_group_of_integers_modulo_n>. Thus any *d* satisfying *d*⋅*e* ≡ 1 (mod *φ*(*n*)) also satisfies *d*⋅*e* ≡ 1 (mod *λ*(*n*)). However, computing *d* modulo *φ*(*n*) will sometimes yield a result that is larger than necessary (i.e. *d* > *λ*(*n*)). Most of the implementations of RSA will accept exponents generated using either method (if they use the private exponent *d* at all, rather than using the optimized decryption method based on the Chinese remainder theorem <https://en.m.wikipedia.org/wiki/RSA_%28cryptosystem%29#Using_the_Chinese_remainder_algorithm> described below), but some standards such as FIPS 186-4 <http://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.186-4.pdf#page=62> may require that *d* < *λ*(*n*). Any "oversized" private exponents not meeting that criterion may always be reduced modulo *λ*(*n*) to obtain a smaller equivalent exponent. Since any common factors of (*p* − 1) and (*q* − 1) are present in the factorisation of *n* − 1 = *pq* − 1 = (*p* − 1)(*q* − 1) + (*p* − 1) + (*q* − 1),[17] <https://en.m.wikipedia.org/wiki/RSA_%28cryptosystem%29#cite_note-17> it is recommended that (*p* − 1) and (*q* − 1) have only very small common factors, if any besides the necessary 2.[2] <https://en.m.wikipedia.org/wiki/RSA_%28cryptosystem%29#cite_note-rsa-2>[18] <https://en.m.wikipedia.org/wiki/RSA_%28cryptosystem%29#cite_note-18>[19] <https://en.m.wikipedia.org/wiki/RSA_%28cryptosystem%29#cite_note-19>[20] <https://en.m.wikipedia.org/wiki/RSA_%28cryptosystem%29#cite_note-20> Note: The authors of the original RSA paper carry out the key generation by choosing *d* and then computing *e* as the modular multiplicative inverse <https://en.m.wikipedia.org/wiki/Modular_multiplicative_inverse> of *d* modulo *φ*(*n*), whereas most current implementations of RSA, such as those following PKCS#1 <https://en.m.wikipedia.org/wiki/PKCS1>, do the reverse (choose *e* and compute *d*). Since the chosen key can be small whereas the computed key normally is not, the RSA paper's algorithm optimizes decryption compared to encryption, while the modern algorithm optimizes encryption instead.[2] <https://en.m.wikipedia.org/wiki/RSA_%28cryptosystem%29#cite_note-rsa-2>[21] <https://en.m.wikipedia.org/wiki/RSA_%28cryptosystem%29#cite_note-21>
The libgcrypt rsa code is mostly visible at https://dev.gnupg.org/source/libgcrypt/browse/master/cipher/rsa.c . (I had to work around some network issues reaching git.gnupg.org, and found dev.gnupg.org which works for me.) The operations of use start around line 909. Here's a link to that line: https://dev.gnupg.org/source/libgcrypt/browse/master/cipher/rsa.c$909 This source can also be cloned at https://dev.gnupg.org/source/libgcrypt.git . I've cloned it locally. The basic operations of use are public() and secret(). They basically just call out to mpi_powm, which is a wrapper for gcry_mpi_powm(). There's a #define in gcrypt.h, which is a different file generated from gcrypt.h.in . Before moving off rsa.c, it's notable that: - public() basically just wraps mpi_powm, using the same structures - secret() has an additional step to remove leading zeros - secret() has a special form that might be used when p and q are known called secret_core_crt() - there's something else called secret_blinded() that is likely documented in a header file or elsewhere Basically, everything of interest is likely in mpi_powm. I'm taking it slower now, because complexity will increase as we get deeper. mpi_powm could look frighteningly new.
[to state the obvious, few people on this cryptography list are using pgp, and I appear to have been influenced not to as well; I'm just trying to help myself process both situations. do you use pgp? do you believe there is any reason to? this is a cryptography list.] The mpi_powm implementations are in mpi/mpi-pow.c : https://dev.gnupg.org/source/libgcrypt/browse/master/mpi/mpi-mpow.c What's exciting, is there are actually two different implementations in that one file. The old one called "simple exponentiation" starts on line 49 and goes through line 353. The second one occupies the second half of the file and starts on line 405. They're both quite big, but a lot of it looks like boilerplate. To learn modular algorithms from this file most efficiently for cracking PGP as a software developer, we want to review the parts of these implementations that directly relate to what is needed to implement modular arithmetic.
participants (1)
-
Karl