
alanh@infi.net (Alan Horowitz) asks:
Is there any open source - or otherwise - knowledge or speculation about which words/phrases the Terra-cycle cpu's are text-searching *for*? If it were your responsibility to eavesdrop on Iranian terrorists ... ... would you know for sure which words/phrases to key on? It doesn't sound like a tractable problem to me.
There are two parts to this question. First, how do you choose targets? It's been a few years since I read Bamford's book (when *is* the next edition coming out, anyway?) but I seem to remember that there is some sort of "committee" that agrees on what to aim at. It probably contains the usual collection of bureaucrats from military and civilian agencies. Despite the lofty budgets, even the NSA's vacuum cleaner has a hose of limited size. The committee allocates the available resources to prioritized objectives. You can probably predict targets by estimating political priorities and clout of the various agencies, and, of course, by watching CNN. Second, how do you ensure that you capture relevant traffic? I'm sure you start with the obvious -- look at traffic that's definitely relevant and set up filters to find more of it. While it's attractive to want to treat this as a state detection problem (message is/isn't relevant) you really want more of a signal analysis solution (measure likelihood of relevance). Also, a high priority target would probably have its hit behavior evaluated by real humans to ensure that the expected amount of relevant traffic is continuously sucked up. My first job, twenty years ago, involved a prototype speech recognition system under contract for Rome Air Development Center (a traditional cover for TLA research). The machine was supposed to go beep whenever a specified word or phrase was heard over the input voice stream. We joked about testing for the term "Russian Spy" but settled on looking for "Kissinger" instead. That let us run tests by listening to news broadcasts of the time. We built custom boards with Schottky TTL and a blazing fast 120ns cycle time. Times change, eh? Rick. smith@sctc.com secure computing corporation