On Wed, Apr 24, 2013 at 11:34 AM, Bryce Lynch wrote:
On Tue, Apr 23, 2013 at 3:29 PM, Bryan Bishop wrote:
Honestly, if you are worried about a publisher tracking you down for partaking in science, then I would (with bias) recommend pdfparanoia to strip out watermarks: https://github.com/kanzure/pdfparanoia
I find the Metadata Anonymization Toolkit useful for similar things:
Hm, it says: "Mat only removes metadata from your files, it does not anonymise their content, nor handle watermarking, steganography [...]"

As far as I can tell (from HTTP server logs), businesses looking for "science violations" search for the watermark strings, not the metadata in the PDF's headers. Here are some IP addresses you should block:
http://diyhpl.us/wiki/users/superkuh/sdf

But if we need to start stripping journal names, paper titles, author names, etc., from PDFs, I am not sure how we would re-assemble that information later, because anyone else would be able to re-assemble it too and use it to find the "science violators".

- Bryan
http://heybryan.org/
1 512 203 0507