cypherpunks-legacy archives 1992-2013
Back in July there was discussion about preserving the available archives. I have now made the available archives from 1992-2013 available via Mailman, and placed copies of the available mbox files within.

Visit them here:
https://lists.cpunks.org/pipermail/cypherpunks-legacy/

The main info page, which links to the sources and a description of what was involved in getting them into Mailman, is here:
https://lists.cpunks.org/mailman/listinfo/cypherpunks-legacy

Finally, here is the README-processing file that describes what I did, and why I decided not to try to incorporate those older archives with the current list archives at https://lists.cpunks.org/pipermail/cypherpunks/. Suggestions welcome.

--------

Processing of Cypherpunks Archives

Available archives of the Cypherpunks email list are incomplete, and in fact there is evidence they have been tampered with and/or redacted over the years.

This project was to do some basic clean-up of the available archives, which are in mbox format, so that they could be ingested and viewed within Mailman.

The archives contain many poorly formed messages, and Mailman defaults to the current date (December 2019, at the time of this writing) when it encounters a problem with the date. So, a separate 'cypherpunks-legacy' list was deployed to make the archives available without overlapping with the current active 'cypherpunks' list, which goes back to July 20 2013. Otherwise, the legacy archive messages would have been peppered into the current archives, in ways that would be difficult to predict or undo.

There were especially many anomalies in the larger source, spanning 1999-2015. In addition to many poorly formed messages (i.e., messages that, in one way or another, could not be cleanly ingested with the Mailman 'arch' command), there were invalid dates, and lines that had an errant "From " at the start.

To ready the sources for ingestion into Mailman, two automated tools were used, followed by some ad hoc edits and changes:

1. 'sortmbox.py' uses a Python library to put messages in by-date order. This proved to be less confusing to Mailman (i.e., fewer messages were inserted into the current month).

2. 'cleanarch' (/var/lib/mailman/bin/cleanarch) is part of the Mailman package. It fixes errant "From " entries at the start of lines.

3. I also used 'sed' to replace invalid dates with valid ones that were in the same ballpark day+time as the message. I found these either when 'arch' complained (such as for dates before the Unix epoch), or when Mailman was showing messages in the future:

   's/ 0101 / 1999 /' | sed 's/ 0102 / 1999 /'
   's/Thu Dec 31 22:40:39 1903/Thu Jul 5 22:40:39 2018/g'
   's/Jan 1904/Jul 2018/'
   's/Date: Sun, 1 Apr 2029 03:07:16 +0200/Date: Sat, 31 Mar 2001 15:59:46 -0800/'
   's/Date: Fri, 3457 Jan 4 61400:2064:61300 +0200/Wed May 29 15:00:02 2013/'

   There might have been a few other small edits made within the files, which I didn't record, simply to help Mailman do a better job of creating browsable archives.

4. I then concatenated all the mbox files (one per year for 1992-1998, plus one larger file for 1999-2015), and reran sortmbox.py and cleanarch.

5. I edited the resulting file to remove everything dated after July 20 2013, when the new list archives were set up.

To do the work above, I created a temporary Mailman list, and repeatedly used the 'arch' command to ingest the archives and fix problems. This was an iterative process.

Once 'arch' was giving sane output, I created a new Mailman list, 'cypherpunks-legacy'. I put the single unified + fixed mbox file where the Mailman tools would find it:

/var/lib/mailman/archives/public/cypherpunks-legacy.mbox/cypherpunks-legacy.mbox

At this point, I could use Mailman to slurp the mbox file in and create the browsable structure.
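For readers who want to try the by-date sorting from step 1 above on their own copy, here is a minimal sketch using Python's standard 'mailbox' module. It is not the sortmbox.py used for this project (that script is listed among the sources as processed/sortmbox.py.txt); the function names and the choice to put messages with unparseable dates at the end are assumptions made for illustration. It also loads every message into memory, which may be impractical for the full 615MB file.

```python
import mailbox
from datetime import timezone
from email.utils import parsedate_to_datetime


def date_key(msg):
    """Sort key for a message: (0, timestamp) if the Date header parses,
    (1, 0.0) otherwise, so undated or malformed messages sort last."""
    raw = msg.get("Date")
    if raw is None:
        return (1, 0.0)
    try:
        dt = parsedate_to_datetime(str(raw))
        if dt.tzinfo is None:
            # A naive result means no usable zone was given; assume UTC.
            dt = dt.replace(tzinfo=timezone.utc)
        return (0, dt.timestamp())
    except (TypeError, ValueError, OverflowError):
        return (1, 0.0)


def sort_mbox_by_date(src_path, dst_path):
    """Copy all messages from src_path into dst_path in Date order.

    dst_path is created if missing; if it already exists, messages are
    appended to it. Returns the number of messages written.
    """
    src = mailbox.mbox(src_path, create=False)
    messages = sorted(src, key=date_key)  # sorted() is stable
    dst = mailbox.mbox(dst_path)
    try:
        for msg in messages:
            dst.add(msg)
        dst.flush()
    finally:
        dst.close()
        src.close()
    return len(messages)
```

Something like sort_mbox_by_date("cyp-1992", "cyp-1992-sorted") would then write a sorted copy. Messages that land at the end of the output are the ones whose dates need manual attention, which is roughly what the 'sed' fixes in step 3 address.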
The 'arch' command:

/var/lib/mailman/bin/arch --wipe cypherpunks-legacy

This served to populate the list archives, which are browsable here:
https://lists.cpunks.org/pipermail/cypherpunks-legacy

The single large mbox file resulting from the steps above is linked at the top of the Archives page. Here is a direct link:
https://lists.cpunks.org/pipermail/cypherpunks-legacy.mbox/cypherpunks-legac...
(615MB, containing approximately 180149 messages)

Please be aware that Mailman's placement of messages by author, date, subject and thread - including the correct by-month folder - is not perfect. Messages sometimes end up in the wrong place, and sometimes threads are split across different years.

Also note that these mbox files did not include attachments. The current Mailman archive does include attachments, but this legacy archive does not. It seems they were not included with the archive input sources (though it's possible some messages have MIME-encoded attachments within them).

Anyone interested in doing a serious dig into the archive should also consult the original mbox files. These can be ingested into any capable email client program, and viewed as separate messages. They may be sorted and searched, just like any other email folder.

The same sorts of issues as those described above will likely be evident in any email client, and different clients may even show a different total number of messages. The sorting, editing, and displaying described above would give somewhat different results if a different toolset were used.

These archives are freely available, and the effort to make them available via Mailman is freely given.

- gbn

--------

Earlier thread on this concluded below:

Subject: Re: newsflash! cypherpunks mailing list is behind cloudflare-NSA

On Fri, Jul 12, 2019 at 06:34:07PM -0400, grarpamp wrote:
On 7/12/19, Greg Newby <gbnewby@pglaf.org> wrote:
Newsflash! This happened in April, and was announced here: https://lists.cpunks.org/pipermail/cypherpunks/2019-April/045250.html We have been on Cloudflare's DNS since then for the email lists.
Use of CF or any other CDN was not mentioned in the announcement, whether for DNS, or HTTPS. The entire internet is NSA anyway.
If CDN for HTTPS, consider multihoming on I2P or Tor so users can still access when CDN javascript captcha or otherwise arbitrarily blocks them or goes down.
As to caching bandwidth and archives...
You really should fork that 335MiB mbox file off now or no later than year end, and compress it, and then once yearly thereafter, and sign them all. People will eventually seed them into IPFS, etc.
Try using a modern unix compression tool like zstd, they are faster, smaller, available for all systems...
https://github.com/facebook/zstd https://facebook.github.io/zstd/ https://code.fb.com/core-data/zstandard/ https://en.wikipedia.org/wiki/Zstandard
On Tue, Dec 10, 2019 at 01:46:37PM -0800, Greg Newby wrote:
Back in July there was discussion about preserving the available archives. I have now made the available archives from 1992-2013 available via Mailman, and placed copies of the available mbox files within.
Visit them here: https://lists.cpunks.org/pipermail/cypherpunks-legacy/ ...
That's some dedicated work. Thank you. Are you able to provide the mbox in 7z or 'tar -caf mb.xz' ?
On Wed, Dec 11, 2019 at 09:17:45AM +1100, Zenaan Harkness wrote:
On Tue, Dec 10, 2019 at 01:46:37PM -0800, Greg Newby wrote:
Back in July there was discussion about preserving the available archives. I have now made the available archives from 1992-2013 available via Mailman, and placed copies of the available mbox files within.
Visit them here: https://lists.cpunks.org/pipermail/cypherpunks-legacy/ ...
That's some dedicated work. Thank you.
My pleasure. Those gnarly mbox files were more of a challenge than I had anticipated at first.
Are you able to provide the mbox in 7z or 'tar -caf mb.xz' ?
Yes, that was easy enough: Done. Note also that the Apache server should be offering on-the-fly compression (gzip) to capable clients as part of the HTTP negotiation.

The 1992-1998 archives were already part of a bzip2: cypherpunks-199209-199812.tar.bz2 (John has shared essentially the same files as a .zip), so I didn't make a .xz of it.

I made an .xz of the cypherpunks-legacy.mbox (i.e., the exact input to the 'arch' command). Also of cypherpunks-1999-2015.mbox (the archive from before Mailman, and the first ~2 years of Mailman), and also of the processed/ directory (i.e., the post-processed stuff that was input to the first file in this paragraph).

I also made sha512sum output in sha512sum.txt (I could also make md5sum, but that is no longer considered cryptographically secure, according to its man page).

Direct links:

https://lists.cpunks.org/pipermail/cypherpunks-legacy.mbox/sources/cypherpun...
(this is what you asked for, I think).

https://lists.cpunks.org/pipermail/cypherpunks-legacy.mbox/sources/processed...

https://lists.cpunks.org/pipermail/cypherpunks-legacy.mbox/sources/cypherpun...

Let me know if other variations would be useful. It's no problem to point wget or whatever at the server, if you want to copy some or all of these files (including the browse-by-thread etc.).
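For anyone who fetches an uncompressed mbox and wants an .xz copy of their own, a rough equivalent can be done with Python's standard 'lzma' module. This is only a sketch: the function name and the default preset are assumptions, and it is not how the .xz files on the server were produced.

```python
import lzma
import shutil


def xz_compress(src_path, dst_path=None, preset=6):
    """Stream src_path into an .xz file and return the output path.

    The file is copied in chunks, so a large mbox does not need to
    fit in memory. preset ranges from 0 (fast) to 9 (smallest).
    """
    if dst_path is None:
        dst_path = str(src_path) + ".xz"
    with open(src_path, "rb") as src, lzma.open(dst_path, "wb", preset=preset) as dst:
        shutil.copyfileobj(src, dst)
    return dst_path
```

Calling xz_compress("cypherpunks-legacy.mbox") would write cypherpunks-legacy.mbox.xz alongside the original. The resulting bytes will generally differ from the server's .xz files, so the published checksums for the .xz files will not match a locally made copy; only the uncompressed contents are comparable.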
For completeness and so everyone gets a copy, here are the sha512sums:

81c1cc1b096a6216485fa2445be098485f292a973af3f0bc9376f15ec199e783118952d08d7a3370f3ae47e7192aa67a1a4affb75fa0a5a70f97adee948ed034 *cypherpunks-199209-199812.tar.bz2
a1e67f1898a7e09913983f76bb95bd5a3e284e13438331fd0234d29d979583b967dfe478e42444526df17786c9de5600400c45e3045406bb6272e29bc6c7a501 *cypherpunks-1999-2015.mbox
e00109d1761a5546205f1ef39c2c05f7ff7cc2f91c7d2a4bd5a2c342a0690cdb6c8514a5037f19775374b9e0511e663217798abe02dfde7bfeb01f543a45dd35 *cypherpunks-1999-2015.mbox.xz
b2f202ee931d06541e4127f6008771659ed86b9f32606875261c254836d0ad630c160276c11d019dfbafb189c3269b5ae78dfb5b395c89b901ef75fc8af50a48 *cypherpunks-legacy.mbox
65952cc183e724be2481e9f95d7443872f0523e281d5da3cc4aa3be8ba22d56fd04b35f4dff642ea2e6aecaf2683ea69d15e06f29ffb1b0cbd329bb9e7e9875e *cypherpunks-legacy.mbox.xz
e04a6aa95a7e060e72d2d4be5b8e52c5ceb5fc6831e8951fe68ca417bcd5d1f794433add775b742917243e5e98228f2b6698255147b7dcfd53e8f5f6f1b543eb *processed.xz
3d4ea45095f4ea373a0a7d609fb70160437f3b23a7456f4ebd95d4a5e35ef4290bde1e312e47f3aa532142aa1ae98194e01e003f32bd0d5249aeb6aa9b910463 *README-sources
4db8840e46ba9c8476b2f95c9425d92a68c54a935d38225603113891e9d5a41e6947e33dcaf2b5a51c24833c0ee85c64401739067c0ca0c15c6f2def01d75365 *cypherpunks-199209-199812/cyp-1992
3598b07c74f224b7bf7adbd68bd24ba3a6c232ec755089f4a14f54c643d19d9f6c0aec40f4acde99e89dfe02070c435c3b8285513edc9d14fade32408435c8c8 *cypherpunks-199209-199812/cyp-1993
43d39dc2ca13de28288ef562e030679555063695a8c6da955a06736f2d1dc6bffd04c8b32eff92bba3c09916fd9cfb21df1c5e70fb387d47b2366a8f89bb96f6 *cypherpunks-199209-199812/cyp-1994
4ba34837f8c13907188ccd97b3f7b754d9eaf70b21b0183e81c2f822b555fc5fa63aaa50d96ee94f7d8dc6f9cddc6d2e5133c20c0bcd162f56901dc5edfc6da5 *cypherpunks-199209-199812/cyp-1995
d5c4ea8845f064c426e8f48d8f8cc84ab039a297e335facfa9f52b4fb669a6d95c069680323327be454d67af7c4d08cc28bf247afabf03f18a85c42deba8483e *cypherpunks-199209-199812/cyp-1996
b4732735ac88dded905ce7daf9e19303e9a129af1f099d9dd30ecd24a9d6f7c8d7dc7c56b34c0a736f599970644f9d6dbd49fc9bc25115850c0c4f0bcbd1d4f0 *cypherpunks-199209-199812/cyp-1997
992e46fcad5bd2954737b514e712042fe98caba93bfe212c0a57814648b1e2156fd0dec32b8c55267217e69670c770f6c2036a58259ce69b891c03f8851a7215 *cypherpunks-199209-199812/cyp-1998
a82ecfcff9f787a89413a401873bd1ae9fe28d4bba12c133ed3f01df9767849cc49e0e8ce3e34c2da43eba8c61d847ff8f8412850356f9310bc9c5b0635182ee *processed/cyp-1992-new
83c1e11af1197528878990a977937bbcea4ea69029902702e1586a09196b3b3b798e736fcf7ad05adde26730a55b27a83b1912d8cbcb76a11f3357ab4e8567a1 *processed/cyp-1993-new
6b0c46ad47644832615a9457dcfb419c39f916bc2a8a98f57b8d7df9266212304ef81ae6817b37493b69e06dded28f1acadaf437a7cf8d023e7b9e6e420e4e7e *processed/cyp-1994-new
0427c28eb9a12ce0b9cab34a364da16abfdfe0842fbb8999087592dd577bfb3546e33571cd79312be58b02523e356d7856a2d9a0c0780002f73fb337974a6c07 *processed/cyp-1995-new
b1ad74d43d0a4e23f157340dc28a88c2ca3a99c59599901a0c056d693fc734b91830748d4970ac47c8a25ab5e1b8dfe32b7c0986005d571b233f2c2d802a8dfb *processed/cyp-1996-new
1fb752d7ccd1e8236c6c7eebba6838abcb6514d2e358d4033b3698acc71a433f17e770632903dc7c60933c1c47314b21adb5a8d4e246764a0162ab9a5d5e3211 *processed/cyp-1997-new
c1d70b69e4c7e03968f791ad6a3822b86ab0ca4b5646798bb8e9c6be8d5bd05bd7034008a8e6e87b8535436a1706765415912ff1e745e8aff491a6ae27038db9 *processed/cyp-1998-new
ea30a7e0a3bf64de77f99b0a22d82507d8913d4630542d0aa6180af418f2372703063aadd2c6ca0eafe7e2a0be5893a821edffca8c0aa670859144c094e85c14 *processed/cyp-1999-2015.mbox-new
72e50c1023a629a68d1d426b0bb8c6aaa45a4b3f48eb02c2ea1b7963922631c6ab57c3a81a4678ece88426700067a8fe02cf14cb807570441ae989872e9f56ca *processed/cyp-unified
294100a510d8639be848e6efcfc931f38b3dbcd2f3dd87592632d4b49c5d06bcd3cc5fa643a3128e11d6c549fbf99f7879531116ac52eba88de6228e2e92f703 *processed/cyp-unified.sorted
b2f202ee931d06541e4127f6008771659ed86b9f32606875261c254836d0ad630c160276c11d019dfbafb189c3269b5ae78dfb5b395c89b901ef75fc8af50a48 *processed/cyp-unified.sorted-1999-July2013
271d88ddc51e996e6a23f126f0a395cc303101fa8d2686aeb04fec3e5f97cc24c5081d3aaf0fbf54b0e457da0fbd3a4ff2e904d011c1d9a05a2d6850112d0d3d *processed/README-processing
0477f4aa28628036a9416fbefce11e28ee89070fe0ecbd05accef8ae0781ba70bb6344d16c1c4327fb2aa1304512cb54d29c5a602fe39eb0967604bd57886142 *processed/sortmbox.py.txt

- Greg
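To check downloaded copies against the checksum list above, running 'sha512sum -c sha512sum.txt' in the download directory is the simplest route. Where that tool is not available, the same check can be sketched with Python's standard 'hashlib'. The function names here are illustrative assumptions, and the parser only handles the plain "HEX *name" or "HEX  name" line format shown above.

```python
import hashlib
from pathlib import Path


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 of a file, reading it in chunks."""
    h = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def parse_sums(text):
    """Parse sha512sum-style lines into a {filename: hex_digest} dict."""
    sums = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) != 2:
            continue  # skip blank or unrelated lines
        digest, name = parts
        # A leading '*' marks binary mode in sha512sum output.
        sums[name.lstrip("*")] = digest.lower()
    return sums


def verify_sums(sums_text, base_dir="."):
    """Return {filename: True/False}; missing files count as False."""
    results = {}
    for name, expected in parse_sums(sums_text).items():
        path = Path(base_dir) / name
        results[name] = path.is_file() and sha512_of(path) == expected
    return results
```

With sha512sum.txt saved next to the downloads, verify_sums(Path("sha512sum.txt").read_text()) returns a dictionary showing which files matched. Files that were not downloaded simply show up as False.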