Information Content Estimation

Cathal Garvey cathalgarvey at cathalgarvey.me
Sun Oct 20 15:16:50 PDT 2013


In case anyone else was wondering, I answered my own question, although
I have yet to learn whether this is efficient for large files. The
answer so far appears to be libmagic, Python bindings for which are
available as "filemagic". It can then be used like so:

-> import ssl, magic
-> M = magic.Magic()
-> M.id_buffer(ssl.RAND_bytes(50))
:: 'data'
-> M.id_buffer("This is text, plain and simple")
:: 'ASCII text, with no line terminators'
-> M.id_buffer("This is text, plain and simple\nand it has more than one line")
:: 'ASCII text'
-> rand = ssl.RAND_bytes(50)  # assumed: "rand" held random bytes from earlier
-> M.id_buffer(rand)
:: 'data'
-> M.close()

(For anyone who's read the docs, yes you're supposed to use a context
manager with filemagic, and no it doesn't work on Py3.3 near as I can
see)

So I'll run with this if nobody has better ideas: the user submits data,
and if a check with filemagic/libmagic returns "data", it's considered
encrypted (because encrypted data should be indistinguishable from
random data); otherwise it's rejected.
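
A minimal sketch of that accept/reject rule, for illustration only. The
names accept_upload and identify are hypothetical; identify stands for any
callable that maps a byte buffer to a libmagic-style description, such as
the M.id_buffer method shown above. The stand-in classifier below is an
assumption so the example runs without libmagic installed.

```python
from typing import Callable


def accept_upload(blob: bytes, identify: Callable[[bytes], str]) -> bool:
    """Accept blob only when the classifier cannot name a format for it.

    'identify' is a hypothetical parameter: any function returning a
    libmagic-style description string for a buffer.
    """
    return identify(blob).strip() == "data"


def stand_in_identify(blob: bytes) -> str:
    """Toy classifier (an assumption, not libmagic): pure ASCII is 'text'."""
    try:
        blob.decode("ascii")
    except UnicodeDecodeError:
        return "data"
    return "ASCII text"


if __name__ == "__main__":
    print(accept_upload(b"This is text, plain and simple", stand_in_identify))
    print(accept_upload(bytes([0xFF, 0x00, 0x9C, 0xE1]), stand_in_identify))
```

With filemagic in place, the call would presumably look like
accept_upload(blob, M.id_buffer), using the same M as in the session above.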

On Sat, 19 Oct 2013 20:41:01 +0100
"Cathal Garvey (Phone)" <cathalgarvey at cathalgarvey.me> wrote:

> Hey all,
> Am mulling over a federated datastore for zero-knowledge web
> applications, using hashcash as a "commitment" price for otherwise
> gratis data storage. All very straightforward, but: Zero knowledge is
> as much for host protection as client protection. Hosts don't WANT
> plaintext.
> 
> Short of stupidly CPU-intensive stuff like letter counting, UTF8
> decoding, etc, how might a server verify it's receiving encrypted
> data? I was thinking of a function that estimates apparent entropy and
> rejects anything that doesn't look random enough to be encrypted.
> What such functions are available, fast, and widely implemented?
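
For reference, the kind of entropy estimate asked about in the quoted
message can be sketched with the standard library alone. This is a rough
illustration, not a vetted test: the function names and the 7.5
bits-per-byte threshold are assumptions, and the estimate is only
meaningful for samples of at least a few kilobytes (a 50-byte sample
cannot exceed about 5.6 bits per byte, however random it is). High entropy
is also not proof of encryption, since well-compressed data scores high
too.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Empirical Shannon entropy of 'data' in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # byte value -> number of occurrences
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_random(data: bytes, threshold: float = 7.5) -> bool:
    """True if the byte distribution is close to uniform.

    The threshold is an assumed value, chosen for illustration only.
    """
    return shannon_entropy(data) >= threshold


if __name__ == "__main__":
    print(looks_random(os.urandom(65536)))
    print(looks_random(b"This is text, plain and simple\n" * 2000))
```

This is one pass over the buffer with a 256-entry counter, so it is cheap
compared with decoding or parsing the payload.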
