I'm spending a fair amount of time at NARA II. Any thoughts on how I might be able to automate retrieval of documents from the CREST database? There are about 10 million pages of CIA docs that haven't even been accessed, much less made their way online.
Since 2000, CIA has installed and maintained an electronic full-text searchable system, which it has named CREST (the CIA Records Search Tool), at NARA II in College Park, Maryland. *Over 11 million pages have been released in electronic format and reside on the CREST database, from which researchers have printed about 1.1 million pages. In order to directly access CREST, a researcher must visit the National Archives at College Park, Maryland.*
http://www.foia.cia.gov/collection/crest-25-year-program-archive
Can anyone access the CREST db? Do I need any special permission?
On Sat, Jan 16, 2016 at 2:55 PM, Michael Best <themikebest@gmail.com> wrote:
I'm spending a fair amount of time at NARA II. Any thoughts on how I might be able to automate retrieval of documents from the CREST database? There are about 10 million pages of CIA docs that haven't even been accessed, much less made their way online.
Since 2000, CIA has installed and maintained an electronic full-text searchable system, which it has named CREST (the CIA Records Search Tool), at NARA II in College Park, Maryland. *Over 11 million pages have been released in electronic format and reside on the CREST database, from which researchers have printed about 1.1 million pages. **In order to directly access CREST, a researcher must visit the National Archives at College Park, Maryland.* http://www.foia.cia.gov/collection/crest-25-year-program-archive
Anyone can access it, but only from NARA II at College Park, MD. I'm there on a semi-regular basis and want to figure out a way to efficiently liberate the docs that are only accessible from there....because that's some weird-ass censorship.
On Sat, Jan 16, 2016 at 7:05 PM, Martin Becze <mjbecze@gmail.com> wrote:
Can anyone access the CREST db? Do I need any special permission?
How exactly is the access set up? Can we run arbitrary software that accesses the DB?
On Sat, Jan 16, 2016 at 7:07 PM, Michael Best <themikebest@gmail.com> wrote:
Anyone can access it, but only from NARA II at College Park, MD. I'm there on a semi-regular basis and want to figure out a way to efficiently liberate the docs that are only accessible from there....because that's some weird-ass censorship.
I'll take notes on the access and interface when I'm there next, hopefully next week.
On Sat, Jan 16, 2016 at 7:16 PM, Martin Becze <mjbecze@gmail.com> wrote:
How exactly is the access set up? Can we run arbitrary software that accesses the DB?
I spoke to someone from the list privately and did a little digging - it looks like the files can't be downloaded or saved electronically, at all, period. They have to be printed instead, from one of four computers which are located in College Park, MD (transparency!). So now I'm working on a plan to print and scan files from CREST. Given the scale (11+ million pages)...
I plan to be down there on Tuesday; I'll get more information and solid numbers (document counts, print speeds, etc.) and see where things are from there. Keep your fingers crossed, I may actually be able to get something.
--Mike
On Sat, Jan 16, 2016 at 7:18 PM, Michael Best <themikebest@gmail.com> wrote:
I'll take notes on the access and interface when I'm there next, hopefully next week.
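For a sense of scale, the print-and-scan plan can be roughed out with some back-of-the-envelope arithmetic. The sketch below uses the figures quoted in this thread (over 11 million pages, about 1.1 million already printed, four computers); the print speed and hours per day are placeholder assumptions, not measurements, and `PrintPlan` is a hypothetical helper name.

```python
from dataclasses import dataclass

# Figures taken from the thread.
TOTAL_PAGES = 11_000_000     # "Over 11 million pages" on CREST
ALREADY_PRINTED = 1_100_000  # "about 1.1 million pages" printed by researchers
WORKSTATIONS = 4             # "one of four computers"


@dataclass
class PrintPlan:
    pages_per_minute: float  # ASSUMPTION: sustained print speed per workstation
    hours_per_day: float     # ASSUMPTION: usable reading-room hours per day
    workstations: int = WORKSTATIONS

    def pages_per_day(self) -> float:
        """Pages printed per day across all workstations."""
        return self.pages_per_minute * 60 * self.hours_per_day * self.workstations

    def days_for(self, pages: int) -> float:
        """Days needed to print the given number of pages."""
        return pages / self.pages_per_day()


# Placeholder values only; real numbers would come from an on-site visit.
plan = PrintPlan(pages_per_minute=20, hours_per_day=6)
remaining = TOTAL_PAGES - ALREADY_PRINTED

print(f"{plan.pages_per_day():,.0f} pages/day")
print(f"{plan.days_for(remaining):,.1f} days for {remaining:,} pages")
```

Even with every workstation printing non-stop at the assumed speed, the estimate lands at roughly a year of full days, and that is before any scanning.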
it looks like the files can't be downloaded or saved electronically, at all, period. They have to be printed instead
This is such bullshit
On Sat, Jan 16, 2016 at 11:03 PM, Michael Best <themikebest@gmail.com> wrote:
I spoke to someone from the list privately and did a little digging - it looks like the files can't be downloaded or saved electronically, at all, period. They have to be printed instead, from one of four computers which are located in College Park, MD (transparency!). So now I'm working on a plan to print and scan files from CREST. Given the scale (11+ million pages)...
I plan to be down there on Tuesday; I'll get more information and solid numbers (document counts, print speeds, etc.) and see where things are from there. Keep your fingers crossed, I may actually be able to get something.
--Mike
On Sat, 16 Jan 2016 23:08:58 -0500 Martin Becze <mjbecze@gmail.com> wrote:
it looks like the files can't be downloaded or saved electronically, at all, period. They have to be printed instead
This is such bullshit
Print everything to PDF? Some shell hackery with wget or cURL and pipe through ImageMagick's convert command?
--
The Doctor [412/724/301/703/415] [ZS]
PGP: 0x807B17C1 / 7960 1CDC 85C9 0B63 8D9F DD89 3BD8 FF2B 807B 17C1
WWW: https://drwho.virtadpt.net/
"We're going to start building everything out of bamboo and duct tape."
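One way to read the ImageMagick suggestion: once pages exist as image files (for example, scans of printouts), they can be bundled into a single PDF. A minimal Python sketch, assuming ImageMagick is installed with its `convert` command on the PATH (ImageMagick 7 also provides it as `magick`) and assuming zero-padded file names such as `page-0001.png`; the function names and paths are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(page_dir, output_pdf, pattern="*.png"):
    """Return the ImageMagick command that joins page images into one PDF."""
    # Sorting is lexicographic, so file names need zero-padded page numbers.
    pages = sorted(Path(page_dir).glob(pattern))
    if not pages:
        raise FileNotFoundError(f"no files matching {pattern!r} in {page_dir}")
    # `convert a.png b.png out.pdf` writes one PDF page per input image.
    return ["convert", *(str(p) for p in pages), str(output_pdf)]


def pages_to_pdf(page_dir, output_pdf, pattern="*.png"):
    """Run ImageMagick on the scanned pages; raises if the tool is missing."""
    cmd = build_convert_command(page_dir, output_pdf, pattern)
    if shutil.which(cmd[0]) is None:
        raise RuntimeError("ImageMagick 'convert' not found on PATH")
    subprocess.run(cmd, check=True)
```

Something like `pages_to_pdf("scans/doc-0001", "doc-0001.pdf")` would then produce one PDF per scanned document, where both paths are made-up examples.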
On 1/17/16, Michael Best <themikebest@gmail.com> wrote:
I spoke to someone from the list privately and did a little digging - it looks like the files can't be downloaded or saved electronically, at all, period. They have to be printed instead, from one of four computers which are located in College Park, MD (transparency!). So now I'm working on a plan to print and scan files from CREST. Given the scale (11+ million pages)...
it's almost like they want to claim transparency in empty words, and frustrate it at all points through deterrent actions... hmm!
I plan to be down there on Tuesday; I'll get more information and solid numbers (document counts, print speeds, etc.) and see where things are from there. Keep your fingers crossed, I may actually be able to get something.
can you bring in a raspberry pi with battery pack to act as a printer on the local network and then "print" to a 200G microSD slotted in it? :P #OpCIADox
On 1/16/16, Michael Best <themikebest@gmail.com> wrote:
Anyone can access it, but only from NARA II at College Park, MD.
If anyone can access the database from a laptop on the guest wireless then something like https://www.att.com/devices/netgear/beam.html would give the rest of the world access. But if access is allowed from only a NARA computer you're pretty much SOL..
I'm there on a semi-regular basis and want to figure out a way to efficiently liberate the docs that are only accessible from there....because that's some weird-ass censorship.
Not at all weird; it seems pretty effective to me. Over 11 million pages "released" and only 10% _printed_ (as in someone would have to scan the page to get it back into electronic format) sounds like a pretty good access control system for something you don't really want to make public.
Regards, Lee
Not at all weird; it seems pretty effective to me. Over 11 million pages "released" and only 10% _printed_ (as in someone would have to scan the page to get it back into electronic format) sounds like a pretty good access control system for something you don't really want to make public.
Unfortunately you're right. The weird part to me is the way it's justified and presented as a transparency tool! Double speak never dies, I suppose. =(
On Sun, Jan 17, 2016 at 2:50 PM, Lee <ler762@gmail.com> wrote:
On 1/16/16, Michael Best <themikebest@gmail.com> wrote:
Anyone can access it, but only from NARA II at College Park, MD.
If anyone can access the database from a laptop on the guest wireless then something like https://www.att.com/devices/netgear/beam.html would give the rest of the world access. But if access is allowed from only a NARA computer you're pretty much SOL..
I'm there on a semi-regular basis and want to figure out a way to efficiently liberate the docs that are only accessible from there....because that's some weird-ass censorship.
Not at all weird; it seems pretty effective to me. Over 11 million pages "released" and only 10% _printed_ (as in someone would have to scan the page to get it back into electronic format) sounds like a pretty good access control system for something you don't really want to make public.
Regards, Lee
On 1/17/16, coderman <coderman@gmail.com> wrote:
On 1/17/16, Michael Best <themikebest@gmail.com> wrote:
... Unfortunately you're right. The weird part to me is the way it's justified and presented as a transparency tool! Double speak never dies, I suppose. =(
"Better than nuthin', see?"
We're obeying the law, see?
On Sun, 2016-01-17 at 14:52 -0500, Michael Best wrote:
Unfortunately you're right. The weird part to me is the way it's justified and presented as a transparency tool! Double speak never dies, I suppose. =(
Yeah, it's about as transparent as Dr Pepper (the soft drink itself, not the company).
How much does it cost to print each page? I'm pretty sure they would charge at least 10¢ per page, plus tax (that's about what FedEx Office and our local libraries charge). That doesn't sound too bad, until you realize 1,000 pages cost $100 to print. And then there's the weight of 1,000 printed pages...
--
Shawn K. Quinn <skquinn@rushpost.com>
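Scaling that per-page guess up to the whole collection is simple arithmetic. A quick sketch, assuming the 10¢ per page figure guessed above (not a confirmed NARA price), no tax, single-sided printing, and ordinary 20 lb letter paper at roughly 5 lb per 500-sheet ream; the function names are made up for illustration.

```python
CENTS_PER_PAGE = 10    # the guess from this thread, not a confirmed price
REAM_SHEETS = 500      # sheets per ream
REAM_WEIGHT_LB = 5.0   # ASSUMPTION: 20 lb bond, letter size, about 5 lb per ream


def print_cost_dollars(pages, cents_per_page=CENTS_PER_PAGE):
    """Printing cost in dollars, before tax."""
    return pages * cents_per_page / 100


def paper_weight_lb(pages):
    """Approximate paper weight in pounds, one page per sheet."""
    return pages / REAM_SHEETS * REAM_WEIGHT_LB


for pages in (1_000, 11_000_000):
    cost = print_cost_dollars(pages)
    weight = paper_weight_lb(pages)
    print(f"{pages:,} pages: ${cost:,.2f}, about {weight:,.0f} lb of paper")
```

Under those assumptions, 1,000 pages come to $100 and about 10 lb, while the full 11 million pages come to $1.1 million and about 110,000 lb.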
On 1/17/16, Michael Best <themikebest@gmail.com> wrote:
Not at all weird; it seems pretty effective to me. Over 11 million pages "released" and only 10% _printed_ (as in someone would have to scan the page to get it back into electronic format) sounds like a pretty good access control system for something you don't really want to make public.
Unfortunately you're right. The weird part to me is the way it's justified and presented as a transparency tool! Double speak never dies, I suppose. =(
Where is it presented as a transparency tool? They seem to be real clear about what they're doing: "In order to directly access CREST, a researcher must visit the National Archives at College Park, Maryland. CIA recognizes that such visits may be inconvienent and present an obstacle to many researchers."
Regards, Lee
I have the beginnings of a plan for this; I'll share more as soon as I double check a few things and run some numbers.
I shall call it... Project OPERATION.
On Sun, Jan 17, 2016 at 3:26 PM, Lee <ler762@gmail.com> wrote:
On 1/17/16, Michael Best <themikebest@gmail.com> wrote:
Not at all weird; it seems pretty effective to me. Over 11 million pages "released" and only 10% _printed_ (as in someone would have to scan the page to get it back into electronic format) sounds like a pretty good access control system for something you don't really want to make public.
Unfortunately you're right. The weird part to me is the way it's justified and presented as a transparency tool! Double speak never dies, I suppose. =(
Where is it presented as a transparency tool? They seem to be real clear about what they're doing: "In order to directly access CREST, a researcher must visit the National Archives at College Park, Maryland. CIA recognizes that such visits may be inconvienent and present an obstacle to many researchers."
Regards, Lee
On Mon, Jan 18, 2016 at 11:58 AM, Michael Best <themikebest@gmail.com> wrote:
I have the beginnings of a plan for this; I'll share more as soon as I double check a few things and run some numbers.
I shall call it... Project OPERATION.
Mail them an 8TB drive with your FOIA request for a digital copy of the dataset; they're only $225.
On Sat, Jan 16, 2016 at 7:07 PM, Michael Best <themikebest@gmail.com> wrote:
Anyone can access it, but only from NARA II at College Park, MD.
Probably not. Not unless you have a REAL-ID or passport to get past the goons theatre desk, submit yourself to logging, camera and cell surveillance, nightly janitorial DNA vacuuming, etc.
I'm there on a semi-regular basis
because that's some weird-ass censorship.
Some weird ass chilling. So you could tell us of that.
and want to figure out a way to efficiently liberate the docs that are only accessible from there....
GPIO driven keyboard and mouse, to camera screencap then OCR. General takeover of workstation.
participants (7)
- coderman
- grarpamp
- Lee
- Martin Becze
- Michael Best
- Shawn K. Quinn
- The Doctor