petabyte on a budget

Thu Jul 8 07:07:58 PDT 2004

Since we were talking about how much storage is available these days:

For a backup, just mirror to a number of similar clusters offsite.

Large Scale Data Repository: Petabox


The Petabox by the Internet Archive is a machine designed to safely store and
process one petabyte of information (a petabyte is a million gigabytes). The
goals and current design points are:
* Low power-- 6kWatts per rack, and 60kWatts for the whole system
* High density-- 100 Terabytes per rack
* Local computing to process the data-- 800 low-end PCs
* Multi-OS possible, Linux standard
* Colocation friendly-- requires our own rack to get 100TB/rack, or 50TB in a
  standard rack
* Shipping container friendly-- able to be run in a 20' by 8' by 8' shipping
  container
* Easy Maintenance-- one system administrator per petabyte
* Software to automate mirroring with itself
* Inexpensive design
* Inexpensive storage
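
The design points above fit together arithmetically, and a quick back-of-the-envelope check makes that visible. The sketch below is only an illustration: the function name petabox_budget and its parameters are made up for this example, and it assumes the 800 PCs and the stored data are spread evenly across the racks, which the list does not actually state.

```python
# Back-of-the-envelope check of the Petabox design points.
# All names here are hypothetical; the figures come from the list above.

TOTAL_TB = 1000      # one petabyte, taken as a million gigabytes
TB_PER_RACK = 100    # high-density custom rack
KW_PER_RACK = 6      # power target per rack
TOTAL_PCS = 800      # low-end PCs in the whole system


def petabox_budget(total_tb=TOTAL_TB, tb_per_rack=TB_PER_RACK,
                   kw_per_rack=KW_PER_RACK, total_pcs=TOTAL_PCS):
    """Derive rack count, power, and per-node storage from the targets."""
    racks = -(-total_tb // tb_per_rack)  # ceiling division
    total_kw = racks * kw_per_rack
    return {
        "racks": racks,
        "total_kw": total_kw,
        # Assumes nodes and data are distributed evenly across racks.
        "pcs_per_rack": total_pcs / racks,
        "tb_per_pc": total_tb / total_pcs,
        "watts_per_tb": total_kw * 1000 / total_tb,
    }


if __name__ == "__main__":
    for name, value in petabox_budget().items():
        print(f"{name}: {value}")
```

With the stated figures this works out to 10 racks, 60 kW in total, 80 PCs per rack, and 1.25 TB per PC. Passing tb_per_rack=50 gives the rack count for standard colocation racks (20); the power figure in that case is not meaningful, since the list only gives a per-rack power target for the 100TB rack.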

PILOT STATUS       5/2004

* The first 100TB Rack is up and running!
* The second 100TB Rack will be up by the end of May
* Thermal Targets have been met
* Systems Bootstrapped from USB Flash Device
* Reiser FS running
* PC-based Router running


For more details, please contact:
