‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐ On Sunday, January 24, 2021 4:02 PM, David Barrett <dbarrett@expensify.com> wrote:
Hi all, I'm the CEO of a company called Expensify, developing a new open source chat application at https://Expensify.cash. I was pretty prolific on the p2p-hackers mailing list back in the day, but this is my first post to Cypherpunk, so... hi!
I'm writing because I've read a bunch of overviews of *how* the protocol is designed, but I haven't really seen good sources explaining (in a dumbed-down enough manner for me to follow) *why* it was designed that way. In general, I'm trying to understand the reasoning behind the Signal protocol in terms of what real world attacks each of its features defends against, because on the surface it seems _really_ complicated in order to protect against only a very narrow range of attacks.
However, I imagine every bit of that complexity has a purpose, and thus I figured this was the right place to try to get to the bottom of that purpose. If you have some other link that you think does a great job breaking it down, I'm all ears! Until then, here are some questions I'd love your help with:
1) This is perhaps an obvious question (I've got to start somewhere, after all), but what is the downside of the simplest possible solution, which I think would be for all participants to publish a public key to some common key server, and then for each participant in the chat to simply re-encrypt the message N-1 times -- once for each participant in the chat (minus themselves) using each recipient's public key?
Arguably the simplest method is to do what you describe. However, a public-key operation produces a shared number of only about 256 to 4096 bits. If the message is longer than this, those shared-secret bits must be "stretched" without revealing the secret. This is why nearly all public-key cryptosystems are paired with a symmetric cipher. And even if you drop the symmetric cipher, you cannot use the shared number directly, since bias in its bits must be removed first. Typically the process is:

public_key/secret_key -> shared_secret -> key_derivation(hash) -> cipher

You might be interested in reading the SpiderOak chat room open specification too.
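To make the key_derivation(hash) step concrete, here is a minimal sketch using an HKDF-style construction (extract-then-expand with HMAC-SHA256, in the style of RFC 5869). It is only an illustration: the names derive_key, hkdf_extract and hkdf_expand, the all-zero salt, and the placeholder shared secret are my assumptions, not something taken from Signal or SpiderOak.

```python
import hashlib
import hmac

HASH_LEN = 32  # output size of SHA-256 in bytes


def hkdf_extract(salt: bytes, shared_secret: bytes) -> bytes:
    """Condense a possibly biased shared secret into a uniform-looking key."""
    return hmac.new(salt, shared_secret, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """Stretch the extracted key to `length` bytes of output key material."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output is too long for HKDF-SHA256")
    output = b""
    block = b""
    counter = 1
    while len(output) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        output += block
        counter += 1
    return output[:length]


def derive_key(shared_secret: bytes, info: bytes, length: int = 32) -> bytes:
    """Hypothetical helper: shared_secret -> key material for a symmetric cipher."""
    salt = bytes(HASH_LEN)  # assumption: no salt available, so use all zeros
    return hkdf_expand(hkdf_extract(salt, shared_secret), info, length)


# Placeholder standing in for the output of a real key exchange.
shared_secret = bytes(range(32))
cipher_key = derive_key(shared_secret, b"chat-room-1 cipher key")
mac_key = derive_key(shared_secret, b"chat-room-1 mac key")
```

Different info strings give independent-looking keys from the same shared secret, which is how one exchange can feed both a cipher and a MAC without reusing key material.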
2) I would think the most significant problem with this ultra-simple design is just performance: asymmetric encryption is just too expensive in practice to use for every single message, especially given the O(n) cost of re-encrypting for every single recipient. Accordingly, the vast majority of the complexity of the Signal protocol is really just a performance optimization exercise to securely create (and rotate) a shared symmetric key that enables high-performance encryption/decryption. Is that right? Said another way, if asymmetric encryption had adequate performance for a given use case, would that obviate most of the advantage of the Signal protocol?
If you had short messages and used ECC, then encrypting directly without a symmetric cipher is arguably safer (fewer components) and reasonably efficient. But every 32 bytes sent requires a 32-byte key and a noticeable bump in CPU time (assuming x25519). The protocol would still be annoying because of all the keys involved, but it is probably workable if the message size is constrained.
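As a rough back-of-the-envelope illustration of that cost, the sketch below counts how many 32-byte keys a message of a given length would need. It assumes one fresh 32-byte key per 32-byte block of plaintext, which is my reading of the estimate above; the function name direct_encryption_overhead is hypothetical.

```python
import math

BLOCK_BYTES = 32  # plaintext bytes covered by one key (assumption)
KEY_BYTES = 32    # size of one x25519 public key


def direct_encryption_overhead(message_len: int) -> tuple[int, int]:
    """Return (number of keys needed, total key bytes) for one recipient."""
    blocks = math.ceil(message_len / BLOCK_BYTES)
    return blocks, blocks * KEY_BYTES


# A 1000-byte message needs 32 keys, i.e. 1024 bytes of key material.
keys_needed, key_bytes = direct_encryption_overhead(1000)
```

Under that assumption the key material roughly doubles the bytes on the wire, and the figure is per recipient, so a group of N members multiplies it by N-1.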
3) The other problem with the ultra-simple design would be a lack of forward secrecy: if anyone's private key were compromised, all historical messages sent to them could be decrypted (assuming the attacker had stored them). But even with the Signal protocol, if someone's local device were compromised (the most likely way for an attacker to get the private key), they're screwed anyway, in two different ways:
a. The act of compromising the private key would likely ALSO compromise nearly all past communications. So all that hard work to rotate historical keys is kind of pointless if the messages themselves are compromised on device.
The participants can delete messages. Without forward secrecy, an adversary who has collected and stored the ciphertext can recover the plaintext from that stored copy once they obtain the shared secret used in the chat. With forward secrecy, the participants delete older shared secrets, which prevents an adversary from recovering deleted messages. Also, cracking a single public key would allow the recovery of all plaintexts; with forward secrecy the adversary is forced to crack multiple keys. This helps preserve privacy as the cryptography and key lengths age.
b. If the private key itself (which is *only* stored on the device) were compromised *without* also revealing the messages stored on that same device (though how this would happen I'm struggling to understand), then an attacker could still decrypt all past communications if they had a perfect record going back to the start.
Once the participants delete the secret components of the key exchange, recovery from ciphertext should not be possible without breaking the crypto primitives. Breaking them is certainly possible in principle, but no technique for doing so is publicly known.
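The "delete older secrets" idea can be sketched with a simple one-way hash ratchet. This is a simplified stand-in, not the full Signal protocol (which also mixes in fresh key exchanges): each message key is derived from a chain key, and the chain key is then overwritten by its successor. The class name MessageKeyChain and the constants 0x01 and 0x02 are illustrative assumptions.

```python
import hashlib
import hmac


def ratchet_step(chain_key: bytes) -> tuple[bytes, bytes]:
    """Derive one message key and the next chain key from the current chain key."""
    message_key = hmac.new(chain_key, b"\x01", hashlib.sha256).digest()
    next_chain_key = hmac.new(chain_key, b"\x02", hashlib.sha256).digest()
    return message_key, next_chain_key


class MessageKeyChain:
    """Hypothetical helper that keeps only the newest chain key in memory."""

    def __init__(self, initial_chain_key: bytes) -> None:
        self._chain_key = initial_chain_key

    def next_message_key(self) -> bytes:
        message_key, self._chain_key = ratchet_step(self._chain_key)
        # The previous chain key is no longer referenced. Going backwards from
        # the new one would require inverting HMAC-SHA256.
        return message_key


# Both sides start from the same secret (a placeholder value here).
initial = hashlib.sha256(b"placeholder shared secret").digest()
alice = MessageKeyChain(initial)
bob = MessageKeyChain(initial)

alice_keys = [alice.next_message_key() for _ in range(3)]
bob_keys = [bob.next_message_key() for _ in range(3)]
```

Someone who seizes the device after the third message finds only the fourth chain key. Stored ciphertext for messages one to three stays protected unless HMAC-SHA256 itself is broken, provided the old keys really were erased.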
4) So in summary, is it safe to say that the primary real-world advantages of the Signal protocol over the super-basic solution are:
a. achieve high performance encryption using frequently rotated symmetric keys, and
I'm not sure you'd want to design a system without symmetric ciphers, for the reasons I listed above, although it could be an interesting experiment.
b. prevent an attacker who has compromised the device AND has a partial record of communications from decrypting messages that are no longer stored on that device.
If you change "partial record" to "complete record" then this sentence is accurate.
Is that the gist of it? Again, these are real high level questions, so I appreciate your patience with me! -david Founder and CEO of Expensify
Lee