Re: Dossiers and Customer Courtesy Cards

5 Jan 2003

      On Sat, 4 Jan 2003, Sunder wrote:
...
Not in any 1U system that I know of unless you mean multiple racks.
It doesn't matter. While NSA builds their own hardware, you might as well
think in terms of vanilla Dells.
...
The biggest ATA drives I see on the market today are 200GB.  Most 1U
systems won't hold more than two of these.  That's nowhere near 1TB!
Dell 1U boxes have three drive bays. Whether the drives are 200 GB, 300 GB
or a TByte apiece (such drives exist), or how many U they occupy, doesn't
matter, as this is an order-of-magnitude estimate.
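To make the order-of-magnitude point concrete, here is a rough sketch in Python. The drive counts and sizes are the ones mentioned in this thread; the helper name `order_of_magnitude` is made up for illustration, and rounding log10 to the nearest integer is only one reasonable way to define "order of magnitude".

```python
import math

GB = 10**9  # decimal gigabyte, the way drive vendors count


def order_of_magnitude(n_bytes: float) -> int:
    """Nearest power of ten for a byte count (hypothetical helper)."""
    return round(math.log10(n_bytes))


# Per-box raw capacity for the drive configurations named in the thread.
configs = {
    "2 x 200 GB": 2 * 200 * GB,
    "3 x 200 GB": 3 * 200 * GB,
    "3 x 300 GB": 3 * 300 * GB,
}

for name, capacity in configs.items():
    exponent = order_of_magnitude(capacity)
    print(f"{name}: {capacity // GB} GB, roughly 10^{exponent} bytes")
```

Under that definition all three configurations land on the same power of ten, which is why the exact bay count and drive size drop out of the estimate.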
...
Also you're forgetting about doing backups; and I don't know about you,
I would not do backups for raw sigint data. (If I had to do backups, I'd
use RAID at the disk level or mirrored servers.) I would do backups for
targeted and distilled data, which is a tiny fraction of the entire sea of
nodes.
...
but I get a fuckload more email than 1K/day.  Granted, averaged out over
The point is not how much you're getting (personal mail; mailing lists are
redundant, so is spam -- NSA could offer the best spam filtering sources
ever); the point is how much people are _writing_. On average.
...
the entire population of the earth - what over 99% of don't even have
email, it may well be 1k/day/person.
Further, you'd want more than one GigE port on these machines just so as
to deal with the traffic.
Off-the-shelf Dells come with twin GBit Ethernet ports. You can throw in
other interconnects which scale better. The traffic is not that high, if
you remember that you can hold an entire day's email traffic in your hand.
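A back-of-the-envelope sketch of that volume, in Python. The 1 kB/day/person figure is the one discussed in this thread; the population of 6 billion is a round-number assumption, and the result is a daily average that ignores peaks, headers and protocol overhead.

```python
PEOPLE = 6_000_000_000            # assumption: rough world population
BYTES_PER_PERSON_PER_DAY = 1_000  # the 1k/day/person figure from the thread
SECONDS_PER_DAY = 86_400


def daily_volume_bytes(people: int, per_person: int) -> int:
    """Total bytes written per day if everyone on earth wrote email."""
    return people * per_person


def average_rate_bits_per_s(volume_bytes: int) -> float:
    """Average line rate needed to carry one day's volume in one day."""
    return volume_bytes * 8 / SECONDS_PER_DAY


volume = daily_volume_bytes(PEOPLE, BYTES_PER_PERSON_PER_DAY)
rate = average_rate_bits_per_s(volume)

print(f"daily volume: {volume / 1e12:.1f} TB")
print(f"average rate: {rate / 1e9:.2f} Gbit/s")
```

Under these assumptions the whole day fits on a few tens of 200 GB drives, and the averaged rate stays below a single GBit Ethernet port.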
...
And you'll need lots of cage monkeys to run around replacing failed disks.
Do the math if the MTBF of one disk is 10,000 hours, what is the MTBF of
say 2 spindles (disks) per machines multiplied by 10000 machines? One
failure every 5 hours?   Hell, that's even assuming MTBF is that high!
How many cage monkeys do you need to deal with hardware failures in a 10
kNode installation which happen a few times in a workday? One. Two, if you
want to deal with each failure immediately. You should look at personnel
requirements and failure rates for COTS clusters in academic environments.
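The failure arithmetic can be sketched as follows, assuming independent disks with a constant failure rate, so that the expected interval between failures in the fleet is the per-disk MTBF divided by the number of disks. The 10,000-hour figure and the 2 x 10,000 disk count are the ones quoted above; the two larger MTBF values are hypothetical and included only to show how the result scales.

```python
def fleet_failure_interval_hours(disk_mtbf_hours: float, disks: int) -> float:
    """Expected hours between disk failures somewhere in the fleet."""
    return disk_mtbf_hours / disks


def failures_per_workday(disk_mtbf_hours: float, disks: int,
                         workday_hours: float = 8.0) -> float:
    """Expected number of disk failures during one workday."""
    return workday_hours * disks / disk_mtbf_hours


DISKS = 2 * 10_000  # 2 spindles per machine, 10,000 machines

# 10,000 h is the quoted figure; 100,000 h and 500,000 h are hypothetical.
for mtbf in (10_000, 100_000, 500_000):
    interval = fleet_failure_interval_hours(mtbf, DISKS)
    per_day = failures_per_workday(mtbf, DISKS)
    print(f"MTBF {mtbf:>7} h: one failure every {interval:4.1f} h, "
          f"{per_day:4.1f} per 8 h workday")
```

With the quoted 10,000-hour MTBF the interval comes out at half an hour rather than five hours; a failure every five hours, or a few per workday, corresponds to a per-disk MTBF of roughly 100,000 hours under this model.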
...
Have you seen: http://www17.tomshardware.com/column/200210141/index.html ?
You're probably also discounting the sheer amount of bandwidth required to
copy all that data, route it to each of those thousands of 1U nodes, and
Email and fax and telex are easy. Voice might be tight. Dunno about 
videoteleconferencing, not many people are doing it yet.
...
then analyze it near real time and provide the ability to search through
the results.  Oh, You'd need several such centers since the worlds data
flows aren't centralized.
The system I mentioned was an illustration that you can process the entire
world's traffic in a single, not very large hall. On a very unremarkable
budget. Yes, you can centralize the world's traffic (a TBit/s fiber link
can feed one kilonode), but you don't have to. You can just switch the
individual clusters into a grid with dedicated fiber lines, and treat it
as a whole. It's just a database, after all.
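The "one TBit/s link per kilonode" figure can be checked with a short sketch. It assumes the traffic is spread evenly over the nodes and ignores protocol overhead; the twin GBit ports per node are the ones mentioned earlier in this message.

```python
LINK_BITS_PER_S = 10**12  # the TBit/s fiber link from the text
NODES = 1_000             # one kilonode
GIGE_BITS_PER_S = 10**9   # one GBit Ethernet port
PORTS_PER_NODE = 2        # twin GBit ports on an off-the-shelf box

# Share of the link each node has to absorb if traffic is spread evenly.
per_node = LINK_BITS_PER_S / NODES

# Fraction of a node's total port capacity that share would use.
utilization = per_node / (PORTS_PER_NODE * GIGE_BITS_PER_S)

print(f"per node: {per_node / 1e9:.1f} Gbit/s")
print(f"port utilization: {utilization:.0%}")
```

Under these assumptions each node sees about one GBit port's worth of traffic, leaving the second port free for talking to the rest of the cluster.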
...
I wonder what the specs are for those nice Echelon centers already in
existence....  Likely they're very different from what you propose.
I have no idea what the specs are. All I'm saying is that it can be done,
now. The capabilities grow a lot faster than the number of subjects to
survey, though threshold countries coming online (see DSL numbers for
Chinese users) will result in a sudden surge of growth. After the world
has saturated, new growth will only come from intermachine traffic and new
forms of communication (broadband video feeds).

One would hope that by then the bulk of that traffic is encrypted, making 
gathering data largely an exercise in futility.

Eugen Leitl
