On Sat, 4 Jan 2003, Sunder wrote:
Not in any 1U system that I know of unless you mean multiple racks.
It doesn't matter. While NSA builds their own hardware, you might as well think in terms of vanilla Dells.
The biggest ATA drives I see on the market today are 200GB. Most 1U systems won't hold more than two of these. That's nowhere near 1TB!
Dell 1U systems have three drive bays. Whether the drives are 200 GByte, 300 GByte or a TByte apiece (such drives exist), or how many U they occupy, doesn't matter, as this is an order-of-magnitude estimate.
Also you're forgetting about doing backups; and I don't know about you,
I would not do backups for raw sigint data. (If I had to do backups, I'd use RAID at the disk level or mirrored servers.) I would do backups for targeted and distilled data, which is a tiny fraction of the entire sea of nodes.
but I get a fuckload more email than 1K/day. Granted, averaged out over
The point is not how much you're getting (personal mail, mailing lists are redundant, so is spam -- NSA could offer the best spam filtering sources ever), the point is how much people are _writing_. On the average.
the entire population of the earth (over 99% of whom don't even have email), it may well be 1k/day/person.
Further, you'd want more than one GigE port on these machines just so as to deal with the traffic.
Off-the-shelf Dells come with twin GBit Ethernet ports. You can throw in other interconnects which scale better. The traffic is not that high, if you remember that you can hold an entire day's email traffic in your hand.
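A rough sketch of that arithmetic, in Python. The 1 kB/day/person figure is the one from this thread; the world population of about 6.3 billion and 1 kB = 1000 bytes are my assumptions, and the function names are made up for illustration.

```python
# Back-of-envelope estimate of the world's daily email volume.
# Assumptions: ~6.3e9 people, each writing 1 kB (1000 bytes) per day.

SECONDS_PER_DAY = 24 * 60 * 60


def daily_volume_bytes(population, bytes_per_person_per_day):
    """Total bytes written per day by the whole population."""
    return population * bytes_per_person_per_day


def average_rate_bits_per_s(volume_bytes):
    """Average bit rate if the daily volume is spread evenly over 24 hours."""
    return volume_bytes * 8 / SECONDS_PER_DAY


volume = daily_volume_bytes(6_300_000_000, 1000)
rate = average_rate_bits_per_s(volume)
print(f"{volume / 1e12:.1f} TB/day, {rate / 1e6:.0f} Mbit/s average")
```

Under these assumptions that is about 6.3 TB per day, or some 583 Mbit/s on average for the whole planet, which is less than a single GBit Ethernet port carries. Peaks will be higher than the average, so treat it as an order-of-magnitude figure only.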
And you'll need lots of cage monkeys to run around replacing failed disks. Do the math: if the MTBF of one disk is 10,000 hours, what is the MTBF of say 2 spindles (disks) per machine multiplied by 10000 machines? One failure every 5 hours? Hell, that's even assuming MTBF is that high!
How many cage monkeys do you need to deal with hardware failures in a 10 kNode installation that happen a few times in a workday? One. Two, if you want to deal with each failure immediately. You should look at personnel requirements and failure rates for COTS clusters in academic environments.
Have you seen http://www17.tomshardware.com/column/200210141/index.html ?
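The disk failure arithmetic can be sketched as follows, assuming independent disks with a constant failure rate, so that the fleet-wide interval is simply the per-disk MTBF divided by the number of disks. The 10,000 hour MTBF, 2 disks per machine and 10,000 machines are the figures quoted above; the 100,000 hour value is a hypothetical comparison point, and the function name is made up for illustration.

```python
# Expected time between disk failures across a whole cluster.
# Assumption: disks fail independently at a constant rate, so
# fleet interval = per-disk MTBF / total number of disks.


def hours_between_failures(mtbf_hours, disks_per_machine, machines):
    """Expected hours between successive disk failures in the fleet."""
    return mtbf_hours / (disks_per_machine * machines)


# 10_000 h is the figure quoted in the thread; 100_000 h is hypothetical.
for mtbf in (10_000, 100_000):
    interval = hours_between_failures(mtbf, 2, 10_000)
    per_workday = 8 / interval  # failures falling within an 8 hour shift
    print(f"MTBF {mtbf} h: one failure every {interval:g} h, "
          f"about {per_workday:g} per 8 h workday")
```

With the quoted 10,000 hour MTBF this model gives one failure every half hour across 20,000 disks; a failure every 5 hours corresponds to a per-disk MTBF of 100,000 hours. Either way the workload stays within what a very small crew can swap by hand.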
You're probably also discounting the sheer amount of bandwidth required to copy all that data, route it to each of those thousands of 1U nodes, and
Email and fax and telex are easy. Voice might be tight. Dunno about videoteleconferencing, not many people are doing it yet.
then analyze it near real time and provide the ability to search through the results. Oh, you'd need several such centers since the world's data flows aren't centralized.
The system I mentioned was an illustration that you can process the entire world's traffic in a single, not very large hall. On a very unremarkable budget. Yes, you can centralize the world's traffic (a TBit/s fiber link can feed one kilonode), but you don't have to. You can just switch the individual clusters into a grid with dedicated fiber lines, and treat it as a whole. It's just a database, after all.
I wonder what the specs are for those nice Echelon centers already in existence.... Likely they're very different from what you propose.
I have no idea what the specs are. All I'm saying is that it can be done, now. The capabilities grow a lot faster than the number of subjects to surveil, though threshold countries coming online (see DSL numbers for Chinese users) will result in a sudden surge of growth. After the world has saturated, new growth will only come from intermachine traffic and new forms of communication (broadband video feeds). One would hope that by then the bulk of that traffic is encrypted, making gathering data largely an exercise in futility.