[liberationtech] data mine the snowden files [was: open the snowden files]

coderman coderman at gmail.com
Wed Jul 9 07:04:06 PDT 2014


On Tue, Jul 8, 2014 at 3:27 PM, grarpamp <grarpamp at gmail.com> wrote:
> ...
> To do any of this you will need to collect all the releases of docs
> and images to date, in their original format (not AP newsspeak),
> in one place. Then dedicate much time to normalizing, convert to
> one format and import into tagged document store, etc. Yes, this
> could be hosted on the darknet.

indeed. i will also be hosting the complete cryptome archive on a hidden
site, as it too is part of this corpus to feed into a normalization
and extraction engine of great justice.  i am using the various python
image processing libraries to accomplish this but any language or tool
could be useful.
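
as a rough illustration of the first normalization pass (collect everything in one place, identify formats, tag by source), here is a minimal sketch using only the Python standard library. the function names, the record layout and the "cryptome" tag are assumptions made for this example, not the actual engine. real image conversion would need an image library on top of this.

```python
import os

# Leading bytes ("magic numbers") for a few common document and image formats.
MAGIC = [
    (b"%PDF-", "pdf"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
]


def sniff_format(path):
    """Guess a file's format from its first bytes, ignoring the extension."""
    with open(path, "rb") as fh:
        head = fh.read(16)
    for prefix, name in MAGIC:
        if head.startswith(prefix):
            return name
    return "unknown"


def inventory(root, source_tag):
    """Walk root and return one record per file, sorted by relative path.

    source_tag is a hypothetical label (e.g. which release a file came from)
    that a tagged document store could index on.
    """
    records = []
    for dirpath, _dirs, files in os.walk(root):
        for fname in files:
            full = os.path.join(dirpath, fname)
            records.append({
                "path": os.path.relpath(full, root),
                "format": sniff_format(full),
                "size": os.path.getsize(full),
                "tags": [source_tag],
            })
    records.sort(key=lambda r: r["path"])
    return records
```

sniffing content instead of trusting extensions matters here because files in a collected corpus are often renamed or mislabeled on the way.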

i had hoped to distribute the cryptome archives further during the
Paris hackfest; alas, unexpected events conspired otherwise.

anyone who would like to host mirrors is welcome to tell me how they
anticipate mirroring ~30G of data as quickly as possible. :)
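
whatever transport a mirror uses, a copy of this size should be checked after transfer. a minimal sketch, assuming a SHA-256 manifest is published alongside the archive (the manifest format and the function names below are hypothetical, chosen for this example):

```python
import hashlib
import os


def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large files never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's '/'-separated relative path to its SHA-256 hex digest."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for fname in files:
            full = os.path.join(dirpath, fname)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            manifest[rel] = sha256_file(full)
    return manifest


def verify_mirror(root, manifest):
    """Return (missing, corrupt): sorted paths absent from or altered in root."""
    missing, corrupt = [], []
    for rel, expected in manifest.items():
        full = os.path.join(root, *rel.split("/"))
        if not os.path.isfile(full):
            missing.append(rel)
        elif sha256_file(full) != expected:
            corrupt.append(rel)
    return sorted(missing), sorted(corrupt)
```

the origin would run build_manifest once and publish the result; each mirror runs verify_mirror against its own copy and re-fetches only what comes back missing or corrupt.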



More information about the cypherpunks mailing list