XKeyScore code authenticity - genuine [was: messing with XKeyScore]
the theme of messing with XKeyScore is amusing[0], but more to the point i was asked to respond to some concerns about authenticity raised in a different post:

"Validating XKeyScore code"
http://blog.erratasec.com/2014/07/validating-xkeyscore-code.html

i'm trying to keep this feedback technical, as i don't like much of Graham's reasoning. (i do however approve of his use of "Great Man" in the Voldemort sense, in reference to Cowboy Alexander[1])

his claim that
"we believe the code partly fake and that it came from the Snowden treasure trove."
should be amended:
"we believe the code deprecated, and that it came from the Snowden archives"

onward!

---

first segment of summary, by point:

# Point 1) "The signatures are old (2011 to 2012), so it fits within the Snowden timeframe, and is unlikely to be a recent leak."
- agreed.

# Point 2) "The code is weird, as if they are snippets combined from training manuals rather than operational code. That would mean it is “fake”."
- false; the code is valid and deprecated (can be used as example) rather than fake.

the technical detail: as a programmer, i know that a regexp rule like:

'''
extractors: {{
  bridges[] = /bridge\s([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}):?([0-9]{2,4}?[^0-9])/;
}}
'''

is both written by a novice regexp'er, and also took them a bit of time. more than they'd spend on an example.

for another example,

'''
for (size_t i=0; i < bridges.size(); ++i) {
  std::string address = bridges[i][0] + ":" + bridges[i][1];
  DB[SCHEMA_OLD]["tor_bridge"] = address;
  DB.apply();
  DB[SCHEMA_NEW]["tor_ip"] = bridges[i][0];
  DB[SCHEMA_NEW]["tor_port_or"] = bridges[i][1];
  DB[SCHEMA_NEW]["tor_flags"] = FLAGS;
  DB.apply();
}
'''

why two commits here to backend changes? as a programmer i understand why this is done, but in a purely fictitious example the double commit would be pointless noise.
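
the double commit in the loop quoted under Point 2 can be made concrete with a small sketch. everything except the loop body is an assumption: `StagedDb`, `Record` and `store_bridges` are hypothetical stand-ins, not XKeyScore APIs, and the guess that `apply()` commits whatever has been staged since the previous call is only one plausible reading of the snippet. under that reading every extracted bridge is written twice, once as a single combined field in the old schema and once as split fields in the new one, which is what one might expect while both schemas are still being kept populated.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins: none of these types come from the leaked source.
enum Schema { SCHEMA_OLD, SCHEMA_NEW };

struct Record {
    Schema schema;
    std::map<std::string, std::string> fields;
};

class StagedDb {
public:
    // DB[schema]["field"] = value stages a field for that schema.
    std::map<std::string, std::string>& operator[](Schema schema) {
        return staged_[schema];
    }

    // Assumed semantics: commit everything staged so far, then clear staging.
    void apply() {
        for (const auto& entry : staged_) {
            committed_.push_back(Record{entry.first, entry.second});
        }
        staged_.clear();
    }

    const std::vector<Record>& committed() const { return committed_; }

private:
    std::map<Schema, std::map<std::string, std::string>> staged_;
    std::vector<Record> committed_;
};

// The loop from the quoted rule, with DB, bridges and FLAGS passed in explicitly.
void store_bridges(StagedDb& DB,
                   const std::vector<std::vector<std::string>>& bridges,
                   const std::string& FLAGS) {
    for (std::size_t i = 0; i < bridges.size(); ++i) {
        std::string address = bridges[i][0] + ":" + bridges[i][1];
        DB[SCHEMA_OLD]["tor_bridge"] = address;
        DB.apply();  // first commit: one combined field, old schema
        DB[SCHEMA_NEW]["tor_ip"] = bridges[i][0];
        DB[SCHEMA_NEW]["tor_port_or"] = bridges[i][1];
        DB[SCHEMA_NEW]["tor_flags"] = FLAGS;
        DB.apply();  // second commit: split fields, new schema
    }
}
```

calling `store_bridges` with a single bridge leaves two records in `committed()`: an old-schema record holding only `tor_bridge`, followed by a new-schema record holding `tor_ip`, `tor_port_or` and `tor_flags`.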

# Point 3) "The story makes claims about the source that are verifiably false, leading us to believe that they may have falsified the origin of this source code."
- false. how does a limited misunderstanding of arcane technicalities invalidate the entirety? if this were true, Robert Graham would be a complete imbecile, rather than technically competent and occasionally wrong.

# Point 4) "The code is so domain specific that it probably is, in some fashion, related to real XKeyScore code – if fake, it's not completely so."
- false. as stated above, these rules are deprecated rather than fictitiously constructed. (and perhaps referenced in training materials for utilizing the particular language engines demonstrated)

i will go into more detail later (i wager i have more big data experience and DPI experience than Mr. Graham the DPI expert does in this domain alone[2] ;)

last but not least, this speaks to the need for greater technical expertise to be applied to the leaked archives. if anything, the domain specific details discussed here show that not just generalists, but an army of specialists, will ultimately be needed to properly parse, and protect based upon, the archives as yet revealed.

best regards,

---

0. "" for those with "Jam Echelon Day" nostalgia ;)
   ^- see whole thread from "messing with XKeyScore"

1. "The character assassination of Keith Alexander"
   '... People have criticized calling him a "great man". I'm quoting the Harry Potter movie here people, where the guy who sells Harry's[sic] wand points out that Voldermort was a great wizard,[sic] a great and terrible wizard'
   http://blog.erratasec.com/2014/06/the-character-assassination-of-keith.html

2. "XKeyScore: it's not attacking Tor"
   '... I am an expert in deep packet inspection (DPI). I've written a system vaguely similar to this XKeyScore system here: (ferret). I find the conclusions in this story completely unwarranted, though the technical information cited by this story is pretty good. I suggest future stories about the NSA's deep packet inspection actually consult with engineers who've written DPI code before making wild claims.'
   http://blog.erratasec.com/2014/07/xkeyscore-its-not-attacking-tor.html

On Sun, Jul 6, 2014 at 7:11 PM, coderman <coderman@gmail.com> wrote:
... a regexp rule like:

'''
extractors: {{
  bridges[] = /bridge\s([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}):?([0-9]{2,4}?[^0-9])/;
}}
'''

is both written by a novice regexp'er, and also took them a bit of time. more than they'd spend on an example.

i should have clarified this statement.

this is code someone wrote to get a job done: they are pulling bridge addresses out of text (email bodies?). it is fine code, similar to what you'd see in any production environment.

this is not what a regexp guru would use to show their ability to tightly match with sparse efficiency. it is also not so simple that a non-PCRE fluent person would use it as a fictitious example.

to be clear: all signs point to this being real code a person wrote to get a job done - parse out bridge addresses from text. the signs point toward this code being legitimate deprecated code, even if not currently useful.

the code does not point toward this being a non-fictitious example, and it seems Robert even alludes to as much with:

"One interesting thing to note about the port number is that it captures the first non-digit character after the number as well. This is obvious[sic] a bug, but since it's usually whitespace, one that doesn't impact the system."

- implying he believes this is a legitimate rule, and also not written by an expert.

best regards,
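
the capture behaviour in that quote can be checked directly. what follows is a minimal sketch that transcribes the rule's pattern into C++ `std::regex` (ECMAScript grammar). the transcription is an assumption, since the engine XKeyScore actually uses is not std::regex, and `BridgeHit`, `extract_bridge` and the 192.0.2.x documentation address are hypothetical names and data chosen for illustration only.

```cpp
#include <regex>
#include <string>

// Hypothetical result type: what the two capture groups held, if anything.
struct BridgeHit {
    bool matched;
    std::string address;       // capture group 1
    std::string port_capture;  // capture group 2, exactly as the rule captures it
};

// Runs the quoted pattern against `text` and reports the first match.
// Assumption: std::regex's ECMAScript grammar treats this pattern the same
// way the original engine does.
BridgeHit extract_bridge(const std::string& text) {
    static const std::regex rule(
        R"(bridge\s([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}):?([0-9]{2,4}?[^0-9]))");
    std::smatch m;
    if (!std::regex_search(text, m, rule)) {
        return BridgeHit{false, "", ""};
    }
    return BridgeHit{true, m[1].str(), m[2].str()};
}
```

for the input `bridge 192.0.2.10:443 fingerprint` the second group comes back as `443` plus the trailing space, which is the extra character the quote describes. because the pattern requires a non-digit after two to four digits, a five-digit port such as `:44300`, or a port sitting at the very end of the text, does not match at all under this transcription.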
On Sun, Jul 6, 2014 at 7:30 PM, coderman <coderman@gmail.com> wrote:
... the code does not point toward this being a non-fictitious example,

i meant "non-functional, fictitious example" of course.

and with that, i will leave my further comments to a later, more sober date... airport security, here i come! :P

best regards,