Encrypted search?

newer
Slashdot | EU Parliament Approves...

Tyler Durden

21 Sep 2003 21 Sep '03

7:04 p.m.

Got a crypto question here. Let's say I push out a list I'd like to keep secret to some client machine. The user of that machine must enter some ID or other piece of information. I want the client machine to perform a search of that ID vs the contents of a list (again, resident locally on that machine), but I don't want the user to be able to see the other entries of that list. Possible? Remember, after the initial push of data out to the client machine, no more messages are exchanged. This means the list must be sent in encrypted form. When the search is performed, the "stupid" thing to do (I think...someone correct me) is to take the user's ID, encrypt it, and then determine if matches an encypted member of the list (and I don't see encrypted each entry individually as a desirable thing). I am assuming that this allows a savvy user to reverse-engineer the encryption. Another option is one I don't have the background at this stage to understand. Let's assume the entire list has been encrypted in one shot. Is there some function such that when this encrypted list is convolved with the user ID a "Yes" or "no" can be obtained (indicating presence or absence from the list)? If the answer is yes, I'd also like to know if knowing this is farily basic to most encryption professionals sphere of knowledge... -TD _________________________________________________________________ Get McAfee virus scanning and cleaning of incoming attachments. Get Hotmail Extra Storage! http://join.msn.com/?PAGE=features/es

Show replies by date

Joseph Ashwood

22 Sep 22 Sep

3:43 a.m.

----- Original Message ----- From: "Tyler Durden" <camera_lumina@hotmail.com> To: <cypherpunks@minder.net> Sent: Sunday, September 21, 2003 3:45 PM Subject: Encrypted search?

...

Got a crypto question here.

Let's say I push out a list I'd like to keep secret to some client machine. The user of that machine must enter some ID or other piece of information. I want the client machine to perform a search of that ID vs the contents of a list (again, resident locally on that machine), but I don't want the user to be able to see the other entries of that list.

Possible? Remember, after the initial push of data out to the client machine, no more messages are exchanged. This means the list must be sent in encrypted form.

Actually sending the list in encrypted form will create holes, the key is to not send the list, but to send the information that allows a member to see that they are on the list.

...

When the search is performed, the "stupid" thing to do (I think...someone correct me) is to take the user's ID, encrypt it, and then determine if matches an encypted member of the list (and I don't see encrypted each

entry

...

individually as a desirable thing). I am assuming that this allows a savvy user to reverse-engineer the encryption.

...

Another option is one I don't have the background at this stage to understand. Let's assume the entire list has been encrypted in one shot. Is there some function such that when this encrypted list is convolved with

Correct that won't work. A smarter idea would be to use the user ID and password to key encryption of a quantity (see Unix password system which is very similar, but lacked the presence of the user ID). the

...

user ID a "Yes" or "no" can be obtained (indicating presence or absence from the list)?

I believe the answer is no, at least not without leaking large quantities of information. It is possible I am wrong, but there are simpler, more straight forward solutions.

...

Based on my interpretation of the problem there are a number of solutions. One fairly straight forward one is: Assuming you have a single file to protect (or a single group of files), and don't need to protect the number of people who have access, the simple solution is to use a method very similar to PGP, but without the key identifiers. While this is just a quick sketch the file undergoes approximately the following: establish public keys for each member of the group (e.g. hash(passphrase, username) = priv, use priv as private key in ECC, everyone can use the same group) Choose 2 random keys (K1, K2), and 2 random IVs for CBC (IV1, IV2) MAC the plaintext file using CBC-MAC, key is K2, IV = IV2 postpend the MAC to the file Encrypt the file using K1 in CBC mode, IV = IV1 For each member in the group take their public key, and construct a shared secret (your choice how), that shared secret is used as the key to encrypt K1 and K2, this (known length) encrypted value is PersonalText[i] The new file format is: number of members of group (n) PersonalText[0] ... PersonalText[n] IV1 IV2 encrypted file On each access the accessing person iterates through the PersonalText list, for each decrypted value (remember there is no authenticator on the value to save room, and raise the cost of determining membership in bulk) perform a full decrypt and MAC verification, if the MAC verifies the decryption is correct. This is rather similar to what PGP and others use for multi-target encryption, but to speed the process they include key ids of some kind, that is effectively the only change (assuming proper choices for omissions). Proving the security of this is more difficult as there is a possibility that the correlations given by PersonalText[0,n] may provide improved methods of breakage, this can be addressed using hashes and random numbers in the PersonalText. However assuming ECC is equivalent to DH key agreement (almost certainly), the determination of whether a given PersonalText[i] is for User U is a simple variation of the Decision Diffie-Hellman problem.

...

If the answer is yes, I'd also like to know if knowing this is farily basic to most encryption professionals sphere of knowledge...

Probably not, until they realize that it can be solved with a reuse of PGP-type messages without key ids, at which point it should be well within their knowledge. Joseph Ashwood Trust Laboratories Changing Software Development http://www.trustlaboratories.com

Dave Howe

5:21 a.m.

Tyler Durden wrote:

...

When the search is performed, the "stupid" thing to do (I think...someone correct me) is to take the user's ID, encrypt it, and then determine if matches an encypted member of the list (and I don't see encrypted each entry individually as a desirable thing). I am assuming that this allows a savvy user to reverse-engineer the encryption. What you do is hash the ID, then compare it to the list of hashed entries, using the ID as the key to decrypt the data associated with that entry while that isn't subject to reverse engineering, the abuse it *is* open to is random guessing of IDs (every "success" gives someone else's record, with failures having no penalty) Adding a password (and combining it with the ID to give your key) will address some of that, but really you need to encrypt each entry individually to prevent someone simply decompiling your code and obtaining your full data list.

...

Another option is one I don't have the background at this stage to understand. Let's assume the entire list has been encrypted in one shot. Is there some function such that when this encrypted list is convolved with the user ID a "Yes" or "no" can be obtained (indicating presence or absence from the list)? no. if you trial encrypt the sample ID for comparison, you hand them the key to the whole list.

Peter Wayner

24 Sep 24 Sep

7:59 a.m.

At 10:11 AM +0100 9/22/03, Dave Howe wrote:

...

Tyler Durden wrote:

...
When the search is performed, the "stupid" thing to do (I think...someone correct me) is to take the user's ID, encrypt it, and then determine if matches an encypted member of the list (and I don't see encrypted each entry individually as a desirable thing). I am assuming that this allows a savvy user to reverse-engineer the encryption. What you do is hash the ID, then compare it to the list of hashed entries, using the ID as the key to decrypt the data associated with that entry while that isn't subject to reverse engineering, the abuse it *is* open to is random guessing of IDs (every "success" gives someone else's record, with failures having no penalty) Adding a password (and combining it with the ID to give your key) will address some of that, but really you need to encrypt each entry individually to prevent someone simply decompiling your code and obtaining your full data list.

...
Another option is one I don't have the background at this stage to understand. Let's assume the entire list has been encrypted in one shot. Is there some function such that when this encrypted list is convolved with the user ID a "Yes" or "no" can be obtained (indicating presence or absence from the list)? no. if you trial encrypt the sample ID for comparison, you hand them the key to the whole list.

Yes, these are all good solutions. If you want a case study of how this might help a company like Amazon, go here: http://www.wayner.org/books/td/u1.php --------------------------------------- My new books: _Policing Online Games_ (http://www.wayner.org/books/pog/) _Java RAMBO Manifesto_ (http://www.wayner.org/books/rambo/)

Thomas Shaddack

22 Sep 22 Sep

10:49 a.m.

On Sun, 21 Sep 2003, Tyler Durden wrote:

...

Got a crypto question here.

Let's say I push out a list I'd like to keep secret to some client machine. The user of that machine must enter some ID or other piece of information. I want the client machine to perform a search of that ID vs the contents of a list (again, resident locally on that machine), but I don't want the user to be able to see the other entries of that list.

Sounds quite like a web access filtering problem.

...

Possible? Remember, after the initial push of data out to the client machine, no more messages are exchanged. This means the list must be sent in encrypted form.

Try a list of hashes of the IDs. Of course this would work only if the bruteforcing of the index would be impractical; eg, it may somehow work for longer email addresses, but probably won't work for phone numbers.

...

When the search is performed, the "stupid" thing to do (I think...someone correct me) is to take the user's ID, encrypt it, and then determine if matches an encypted member of the list (and I don't see encrypted each entry individually as a desirable thing). I am assuming that this allows a savvy user to reverse-engineer the encryption.

Good hash (MD5, SHA1, etc. - their other advantage is that their code is out there and you don't have to write it) can't be reversed. However, they can be bruteforced, if the input set isn't impractically big (eg, a set of phone numbers from one area code is less than 10,000 possibilities, which is trivial to take, calculate a hash for every one of them, and check if the database contains it).

...

Another option is one I don't have the background at this stage to understand. Let's assume the entire list has been encrypted in one shot. Is there some function such that when this encrypted list is convolved with the user ID a "Yes" or "no" can be obtained (indicating presence or absence from the list)?

Yes. Hash is there / hash is not there.

...

If the answer is yes, I'd also like to know if knowing this is farily basic to most encryption professionals sphere of knowledge...

I suppose so. Similar approach is used for accelerating searches in long lists of data. Maybe there is better approach, my crypto-fu isn't good enough yet to do more than kibitzing.

ken

1:35 p.m.

Tyler Durden wrote:

...

Let's say I push out a list I'd like to keep secret to some client machine. The user of that machine must enter some ID or other piece of information. I want the client machine to perform a search of that ID vs the contents of a list (again, resident locally on that machine), but I don't want the user to be able to see the other entries of that list.

[...]

...

When the search is performed, the "stupid" thing to do (I think...someone correct me) is to take the user's ID, encrypt it, and then determine if matches an encypted member of the list (and I don't see encrypted each entry individually as a desirable thing). I am assuming that this allows a savvy user to reverse-engineer the encryption.

This is, roughly, how traditional Unix password security works. Reverse-engineering the encryption may or may not be possible, and ought not to matter if you have used a strong enough method and long enough keys. And anyway, if this is running on the client machine then they already have a program that can do the work. What is possible is brute-forcing by encrypting the whole dictionary and trying every word one by one. If your algorithm is strong enough, your key long enough, and above all if the space from which the plaintext is taken is large enough, than this sort of approach can be made sort of safe enough for most applications. But why bother?

Ralf-P. Weinmann

24 Sep 24 Sep

10:06 a.m.

On Sun, Sep 21, 2003 at 06:45:21PM -0400, Tyler Durden wrote:

...

Got a crypto question here.

Let's say I push out a list I'd like to keep secret to some client machine. The user of that machine must enter some ID or other piece of information. I want the client machine to perform a search of that ID vs the contents of a list (again, resident locally on that machine), but I don't want the user to be able to see the other entries of that list.

Have a look at "Searchable Public Key Encryption" by Boneh et al [1] and Song, Wagner and Perring's paper "Practical Techniques for Searches on Encrypted Data" [2]. Cheers, Ralf [1] D. Boneh, G. Di Crescenzo, R. Ostrovsky and G. Persiano, Searchable Public Key Encryption, IACR ePrint 2003/195 http://eprint.iacr.org/2003/195/ [2] D. Song, D. Wagner and A. Perrig, Practical Techniques for Searches on Encrypted Data, in Proc. of the 2000 IEEE symposium on Security and Privacy (S&P 2000). -- Ralf-P. Weinmann <weinmann@cdc.informatik.tu-darmstadt.de>

8003

Age (days ago)

8006

Last active (days ago)

List overview

Download

6 comments

7 participants

participants (7)

Dave Howe
Joseph Ashwood
ken
Peter Wayner
Ralf-P. Weinmann
Thomas Shaddack
Tyler Durden