XKeyscore rules - technology utilized
Based on the XKeyscore rules, does anyone have some idea of the technology being utilized?

Looking at the mapreduce::plugin definition I get the impression Hadoop is in use. Hadoop provides a streaming interface to MapReduce, letting one write the map and reduce steps in any program or language of their choosing (see [1] for an example). Can someone with more knowledge of distributed data technologies confirm this?

1. http://cs.smith.edu/dftwiki/index.php/Hadoop_Tutorial_2.2_--_Running_C++_Pro...
   see also slide 5: http://cecs.wright.edu/~tkprasad/courses/cs707/ProgrammingHadoop.pdf

<quote>
cat input | grep | sort | uniq -c | cat > output
Input | Map | Shuffle & Sort | Reduce | Output
</quote>
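To make the "any language" point concrete, here is a minimal sketch of a Hadoop Streaming job in Python that mirrors the grep | sort | uniq -c pipeline from the slide. The file name, pattern, and paths are made up for illustration; the leaked rules say nothing about the actual jobs.

#!/usr/bin/env python3
# grep_count.py -- minimal Hadoop Streaming sketch (illustrative only).
# The mapper and reducer are ordinary executables that read stdin and write
# stdout, which is why any language can be plugged in. A hypothetical run:
#
#   hadoop jar hadoop-streaming.jar \
#       -input /data/raw -output /data/counts \
#       -mapper "grep_count.py map" -reducer "grep_count.py reduce" \
#       -file grep_count.py
#
# Roughly equivalent to: cat input | grep | sort | uniq -c | cat > output
import re
import sys

PATTERN = re.compile(r"xkeyscore", re.IGNORECASE)  # stand-in for the "grep" step

def map_phase():
    # Map: emit "<line>\t1" for every input line matching the pattern.
    for line in sys.stdin:
        line = line.rstrip("\n")
        if PATTERN.search(line):
            print(line + "\t1")

def reduce_phase():
    # Reduce: input arrives grouped by key thanks to the shuffle & sort step,
    # so identical keys are adjacent and can be summed like `uniq -c`.
    current_key, count = None, 0
    for line in sys.stdin:
        key, _, value = line.rstrip("\n").partition("\t")
        if key != current_key:
            if current_key is not None:
                print("%7d %s" % (count, current_key))
            current_key, count = key, 0
        count += int(value or 1)
    if current_key is not None:
        print("%7d %s" % (count, current_key))

if __name__ == "__main__":
    if len(sys.argv) > 1 and sys.argv[1] == "reduce":
        reduce_phase()
    else:
        map_phase()

The only contract is "read lines on stdin, write tab-separated key/value pairs on stdout", which is what makes the language-agnostic claim plausible.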
Another question: how much traffic are they monitoring with these definitions? All visible traffic? Almost all of it except the u$a?

I suspect the above will require quite a lot of hardware. I suppose this is a matter of importance for the dear NSA.

To paraphrase a Susan Sontag quote [1]: ``Most people in this society who aren't actively terrorists are, at best, reformed or potential terrorists.''

[1] http://thinkexist.com/quotation/most_people_in_this_society_who_aren-t_activ...

On Fri, Jul 04, 2014 at 04:38:20PM +0200, Nathan Andrew Fain wrote:
Based on the XKeyscore rules, does anyone have some idea of the technology being utilized?

Looking at the mapreduce::plugin definition I get the impression Hadoop is in use. Hadoop provides a streaming interface to MapReduce, letting one write the map and reduce steps in any program or language of their choosing (see [1] for an example). Can someone with more knowledge of distributed data technologies confirm this?

1. http://cs.smith.edu/dftwiki/index.php/Hadoop_Tutorial_2.2_--_Running_C++_Pro...
   see also slide 5: http://cecs.wright.edu/~tkprasad/courses/cs707/ProgrammingHadoop.pdf

<quote>
cat input | grep | sort | uniq -c | cat > output
Input | Map | Shuffle & Sort | Reduce | Output
</quote>
On Friday, 4 July 2014 at 19:07:05, Georgi Guninski wrote:
Another question:
How much traffic are they monitoring with these definitions? All visible traffic? Almost all of it except the u$a?
Well, some definitions contain Five Eyes country codes as negative matching rules (i.e. IPs from Five Eyes countries will *not* get matched), others do not have this condition.

I find this very surprising, as it suggests that Five Eyes and other exclusion rules are possibly defined on a per-fingerprint basis. I would have thought these would rather be implemented somewhere higher up (i.e. some post-processing/post-filtering), so that IPs from Five Eyes don't get accidentally snatched due to somebody forgetting to include the rule in their fingerprint (a toy sketch of the difference is below).

On the other hand, I guess it can also be the other way around: the NSA doesn't give a flying fsck about Five Eyes and the policy is to "grab everything, nobody will know anyway"; the "do not include Five Eyes IPs" rule in one of the fingerprints would then be an overzealous technician including it because they thought they should ("we don't spy on our friends", etc.).

Fun stuff either way.

--
Pozdr
rysiek
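PS. To make the two placements concrete, a toy sketch in Python. The fingerprint names, session fields, and the FIVE_EYES set are assumptions for illustration; nothing here is the actual XKeyscore rule syntax or engine.

# Design 1: the exclusion is written into each fingerprint by its author.
# Forgetting the clause silently widens collection -- the failure mode above.
FIVE_EYES = {"US", "GB", "CA", "AU", "NZ"}  # hypothetical country codes

def fingerprint_tor_bridge(session):
    return ("bridges.torproject.org" in session["host"]
            and session["country"] not in FIVE_EYES)

def fingerprint_anonymizer(session):
    # Author forgot the Five Eyes exclusion here.
    return "mixmaster" in session["user_agent"]

FINGERPRINTS = [fingerprint_tor_bridge, fingerprint_anonymizer]

# Design 2: the exclusion is applied once, "higher up", after all fingerprints.
def post_filter(session):
    return session["country"] not in FIVE_EYES

def select(session, central_exclusion):
    hit = any(fp(session) for fp in FINGERPRINTS)
    if central_exclusion:
        return hit and post_filter(session)
    return hit

# A session from a Five Eyes country slips through under Design 1 because one
# fingerprint lacks the exclusion, but is caught by the central post-filter.
session = {"host": "example.org", "user_agent": "mixmaster", "country": "GB"}
print(select(session, central_exclusion=False))  # True  (collected)
print(select(session, central_exclusion=True))   # False (filtered out)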
On Fri, Jul 4, 2014, at 04:38 PM, Nathan Andrew Fain wrote:
Based on the XKeyscore rules, does anyone have some idea of the technology being utilized?

Looking at the mapreduce::plugin definition I get the impression Hadoop is in use. Hadoop provides a streaming interface to MapReduce, letting one write the map and reduce steps in any program or language of their choosing. Can someone with more knowledge of distributed data technologies confirm this?
It's been known for a while that the NSA are using Hadoop (June 9, 2013) [1]:

"The NSA's advances have come in the form of programs developed on the West Coast—a central one was known by the quirky name Hadoop—that enable intelligence agencies to cheaply amplify computing power, U.S. and industry officials said."

Also, from the Hadoop Summit 2014 speaker lineup [2]:

"Joey Echeverria is Cloudera's Chief Architect for Public Sector where he coordinates with Cloudera's Customers and Partners as well as Cloudera's Product, Engineering, and Field teams to speed up the time it takes to move Hadoop applications to production. Previously Joey was a Principal Solutions Architect where he worked directly with customers to deploy production Hadoop clusters and solve a diverse range of business and technical problems. Joey joined Cloudera from the NSA where he worked on data mining, network security, and clustered data processing using Hadoop."

Alfie

[1] http://online.wsj.com/news/articles/SB10001424127887323495604578535290627442964?mg=reno64-wsj&url=http%3A%2F%2Fonline.wsj.com%2Farticle%2FSB10001424127887323495604578535290627442964.html
[2] http://hadoopsummit.org/san-jose/speakers/

--
Alfie John
alfiej@fastmail.fm
On Sun, Jul 6, 2014 at 11:38 PM, Alfie John <alfiej@fastmail.fm> wrote:
... It's been known for a while that the NSA are using Hadoop (June 9, 2013)[1]:
"The NSA's advances have come in the form of programs developed on the West Coast—a central one was known by the quirky name Hadoop—that enable intelligence agencies to cheaply amplify computing power, U.S. and industry officials said."
Hadoop made the Utah data center ;) [ now if only all that computing was actually performed in the public interest ... ]
participants (5)
- Alfie John
- coderman
- Georgi Guninski
- Nathan Andrew Fain
- rysiek