Protocols for a Data Bank The purpose of a data bank is to store large bodies of information for long periods of time. I suggest here some protocols and contracts for a data bank and its customers. We then discuss risks, incentives and stratification of the data storage industry. These ideas implicitly rely on several types of cryptography-- public key, secure hash, symmetric ciphers and blind signatures. To explain these technologies here would substantially obscure the presentation for those who know of such things and help very little for those who don't. Here are several transactions that a data bank engages in. Acquire data: A client anonymously sends a collection of data along with funds sufficient to warrant the bank's holding the data for a few days and computing its secure hash. The bank knows the data only by its secure hash. Selling Hat Checks: The bank will sell a hat check to anyone who will pay a negotiated price. The hat check specifies the secure hash of the data, the penalty to be paid upon failure to produce the data, and the cost of redeeming the data. The hat check is signed blindly by the bank and is a bearer instrument. Any holder of a hat check can present the check along with the redemption fee and demand the data. The data bank must then either produce the data or pay the penalty to the holder of the hat check. A particular hat check is canceled whenever the bank pays the penalty like a spent Chaum DigiCash note. The bank can sell multiple hat checks for the same data. Different hat checks for the same data may specify different penalties. Sell a copy of an acquisition: Any one can request a piece of data identified only by its secure hash. The bank is free to sell a copy of the data to anyone with the secure hash. The bank sets the price. Publish index: The bank can publish its list of hashes. (This makes data hunters possible.) Cancel a hat check: A holder of a hat check may sell it back to the bank at a negotiated price thus releasing the bank from the threat of paying a penalty in the future. This also allows the bank to retrieve the physical storage where the data is stored if it is sure that it has not sold other hat checks for the data. The hat check may specify expiration dates, cancellation terms etc. The bank is explicitly permitted to disseminate the data and may well do so to lay-off risks. In this sense a data bank is like in insurance company that spreads and shares risks. A hat check may be viewed as a life insurance policy for the data. Risks Dividing trust may be done by agreeing on a notary. Upon redemption, for instance, a trusted notary might examine the hat check, accept the payment specified therein from the client, pass over the data on its way from the bank to the client while computing the secure hash, and if the secure hash matches that in the hat check, deliver the payment to the bank. The notary need not have long term financial stability as must the bank. Brokers may have an interface similar to a bank. They return baskets of hat checks. This reduces the risk to the client that one of the data banks will fail financially and be unable to pay the penalty. The broker need not be financially stable. Data Hunters can engage in knowing who has what data. Given a hash they can tell you what banks have the data. This would presumably require a new protocol with the bank. This might be the ultimate URL or URI server. Inflation can damage incentives. Hat checks might be denominated in gold or currency baskets or what ever. RSA modulus size is critical for long term contacts. 2K bits of modulus or more may be warranted. Example I can imagine the Getty Museum digitizing its Rembrandts and storing the results in a data bank. The data might be insured for $100,000,000. The bank would disseminate the data to increase security and lower its risk. The museum would probably encrypt the data and share the key and hash ala Shamir for safe keeping. The museum would not share the hat check because it wants to be the one paid upon default. Incentives A data bank, or any other player, may find that keeping data profitable beyond the point of any outstanding hat checks. It can make money by supplying copies of the data in return for a fee plus secure hash. Indeed new hat checks may be sold after the last had expired. Data banks thus have an incentive to disseminate their list of holdings in the form of hashes, as input to bounty hunters. Design Considerations It may seem strange that the data bank does is willing to sell data to who ever will pay. I suggest that because it is so easy to encipher the data and not have to trust the bank. You can distribute the key thru what ever channels you transmit the secure hash of the data. Note that bank clients are always anonymous. Data is never held for some known person. Data may be held solely for speculation. The purpose of the penalty is to motivate the bank to keep data for which there is no reason to forecast sales revenue. The Bank's State Logically the bank can perform all of these transactions by merely keeping the unordered set of acquisitions. It is practically necessary to index these by their secure hash but this can be rebuilt on demand. It must keep canceled hat checks lest it become liable to extra penalties. The bank need not keep records of hat checks that it has sold unless it wants to know when it can delete acquisitions. It may want to keep marketing information to know when acquisitions are worth keeping merely to sell copies of.