I was exploring the concept of a "data haven" which, to my knowledge, a place whose location is unknown to its users, but via anonymous remailers, files can be stored and retrived from it. This is certainly on-topic. As stated, however, the outline suffers badly froma confusion of purpose. It is not necessary to solve every problem that can be thought of, merely to solve the most important problem in such a way that allows it to be combined with other known solutions. Specifically, the proposal worries far too much about communications security and routing issues, which best go elsewhere in the abstraction. The main service proposed is data storage, not anonymous remailing. Remailing can be done with other segments. Secondly, such storage need not be tied to identity. There's no need for passwords or passphrases or even public keys. The main idea here is storage. You want the property that arbitrary people can't scan the storage facility for content, but identity, while it would work, is _more_ than is necessary. (Can anybody anticipate the solution? See below.) 2: One must have to "hide" behind a VERY TRUSTABLE remailer, [...] This is a concern about communications, and is not necessary to the main idea of remote archiving. 4: A need for verifing that the mail got to the DH successfully since data errors do occur, and sometimes networks truncate mail packets. Again, this communication issue should be dealt with in a separate layer that is concerned about the reliability of communications. 5: A way of making verifing that the user is who (s)he claims to be. Identity-based retrieval is possible, but it's not necessary. Since the service is single purpose (storage) and won't be dealt with directly by humans, i.e. no command prompt, but rather will act as a back end for some retrieval process, the persistence of identity isn't required at the back end. Some persistence will certainly be useful, but it can occur at the user's end. 6: Multiple security levels, so files cannot be retrived even if one's PGP key is compromised (user settable) This is really overkill. Every bit of complication makes the code harder to design, harder to write, harder to debug, and harder to deploy. A simple solution with the basic function can later be elaborated upon. 8: There will need to be a way to tell if the DH is up or not. If you make a request, and nothing comes back, it's not up. I don't see the value in extra functionality. 9: How will PGP keys be stored and indexed? Again, this issue can be finessed. At least part of the issue is a communications one as well, which is best dealt with elsewhere. 10: How would people be able to trust a DH? If you store only encrypted data--and only the stupid would not--the only bit of trust is in continued uptime. Replication and redundancy can be handled at the user's end. At some point _every_ replication bottoms out to the unreplicated storage of some bit of data. This is the primitive, and this deserves to get implemented first. 11: How would a DH turn away files because the disk is full? Silent failure should work just fine. Disk space limitations are just as difficult to deal with as communication failures. 12: Would integrating DigiDollars with a DH be a good idea? At some point when they exist, yes. Right now, without such mechanisms, requiring this will prevent any deployment. I apologize for the length of this post, but there are a lot of questions and problems in making a stable, usable data haven. Looking to implement the final goal as a first project is doomed to failure. Implementing a simple primitive as an attainable project is a much better idea. Now for some specifics. There is a package called Almanac which is a file-by-mail server. Leveraging off this code is a good place to start. Lots of the basic issues are already solved. Now, about authentication. The basic service is storage. It's not even providing name access to the storage. The data itself is what is desired, and a cryptographic one-way hash function suffices as a name. Knowledge of the hashcode provides all the authentication that is needed. If you don't know the hashcode, you can't get the file. If you do know the hashcode, you can. No one else can guess the hashcode, and since no one else knows these hashcodes, the hashcodes suffice as a replacement for the presistence of identity. Furthermore, the many files stored by a particular individual are not linked together in any way on the remote site. The storage site need not have this data; in fact even having this data introduces another security risk. The software on the user end can keep track of any mapping desired. Some sort of tracking software on the user end will be needed in any case to keep track of what is stored where; it may as well keep track of a remote name mapping. So the primitives to implement are very simple; there are two: "store text T" and "retrieve the text with hashcode N". Perhaps a third is also desired: "is text with hashcode N present?". This kind of system is very simple. For implementation of the back end, the files can be stored with filenames which are hexadecimal representations of their hashcodes. This representation allows one to leverage the existing index structure of the file system, avoiding the need to code one inside the application. For the front end, a log file will suffice for a trial version of name mapping. The retrieval method is "grep by hand". Something more advanced can be implemented later, perhaps something that looks like a file system or an ftp site. Eric