This is a re-post of an earlier message in which I accidentally wrote "nntp" in place of "http". I have added some more material, too. Please ignore the earlier message, and thanks to those who pointed out the mistake.

We have had some discussions here about privacy of accesses on the World Wide Web. Presently servers get a variable amount of information about the people accessing their sites, depending on the particular software being used and how it is configured. This is potentially harmful to the privacy of WWW users, in that their accesses can be recorded and later put to other uses. Far from being a hypothetical concern, I believe many companies are collecting this information and using it to build up possible future email mailing lists, etc.

I spoke recently with someone who is designing enhanced server software for the web. Their system will keep all kinds of statistics about who accesses which pages on the server, correlating that with which people request information on the products being sold. We have also seen how even too-cool Wired magazine is demanding user names to allow access to their pages. (Remember: username cypherpunk, password cypherpunk.)

Here are some things you can do to reduce this problem. First, to see how bad the problem is for you, try connecting to:

    http://www.uiuc.edu/cgi-bin/printenv

This just displays environment variables, which show what information about you is being received by servers. Look particularly at the lines reading HTTP_FROM and REMOTE_HOST. These may contain your user name and computer address.

You may be able to remove your user name information. Some clients, including, I am told, Netscape and version 2 of Mosaic for Mac/Windows, allow you to set your email address, which is handy, but then they send it along to servers, which is harmful to your privacy. You might want to consider not setting this field and using other programs for sending mail.
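
For the curious, here is a minimal sketch of what a printenv-style script on the server side might do, cut down to the variables that matter for privacy. It is not the actual script at the URL above. The function name show_privacy_vars is made up for illustration; the variable names are the usual CGI ones that a server sets before running a script.

```shell
#!/bin/sh
# Sketch of a printenv-style report limited to privacy-relevant variables.
# show_privacy_vars is a hypothetical name used only for this example.
show_privacy_vars() {
    for name in HTTP_FROM REMOTE_HOST REMOTE_ADDR REMOTE_IDENT REMOTE_USER HTTP_USER_AGENT; do
        # Indirect lookup: expand the variable whose name is held in $name.
        eval "value=\${$name-}"
        if [ -n "$value" ]; then
            printf '%s=%s\n' "$name" "$value"
        else
            printf '%s is not set\n' "$name"
        fi
    done
}

# When installed as a CGI script, one would emit a header and then the report:
# printf 'Content-type: text/plain\n\n'; show_privacy_vars
```

A client that has been given your email address sends it along in its From header, and that is what turns up on the server as HTTP_FROM.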
Also if people complain about this then perhaps the makers of this software will add an option to suppress sending the info.

Even if you don't see your name in HTTP_FROM it still may be possible for somewhat more sophisticated programs to log your access if the REMOTE_HOST information is correct and you are running on a Unix system or something similar. This is done via the identd service if that is running on your computer. The server can use this service to ask for your user name once you are connected. One way to see if identd is running on your computer is to telnet to your own computer on port 113 and see if anything is there (telnet <your-computer-name> 113). If so then this is potentially another privacy exposure.

I have recently been experimenting with using "proxy servers" to remove even the REMOTE_HOST information from the server's view. Proxy servers are servers which basically receive WWW connections and pass them along. Then when the data comes from the remote site they pass it back to the originating user's site. Because the proxy server is in the middle the remote site never sees the host name of the originating user. In this respect they are somewhat similar to our cypherpunk remailers, hence the title of this article. (The purpose of proxy servers has nothing to do with this function; they are designed to allow easy WWW access from users who are on firewalled sites. But they happen to serve our purposes as well.)

Interestingly, the standard httpd (http daemon, the master server which runs on a site which offers web pages) from CERN includes proxying capability automatically! All you have to do is to add four lines to the configuration file. (See the URLs below for more info.) If this idea proves sound, perhaps some cypherpunks running httpd will enable proxies and serve as "remailer operators of the web".
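
To make the identd exposure mentioned above more concrete, here is a rough sketch of the exchange as I understand it from RFC 1413: the remote server connects to port 113 on your machine, sends one line naming the two port numbers of your connection, and identd answers with one line that contains your user name. The helper names ident_query and ident_user are hypothetical, the port numbers and user name in the example are invented, and the parsing is deliberately simplified.

```shell
#!/bin/sh
# Sketch of the ident exchange as seen from the web server's side.
# ident_query and ident_user are hypothetical helper names.

# Build the one-line query sent to port 113 on the user's machine:
# "<port on the user's machine>, <port on the web server>".
ident_query() {
    printf '%s, %s\r\n' "$1" "$2"
}

# Extract the user name from a reply such as
#   "6193, 80 : USERID : UNIX : alice"
# Prints nothing and returns 1 for ERROR replies or malformed input.
ident_user() {
    reply=$(printf '%s' "$1" | tr -d '\r')
    case $reply in
        *:*USERID*:*:*) ;;
        *) return 1 ;;
    esac
    # The user name is everything after the third colon.
    user=${reply#*:*:*:}
    # Simplification: drop one leading space if the reply has one.
    printf '%s\n' "${user# }"
}
```

With these, ident_query 6193 80 produces the line "6193, 80", and ident_user "6193, 80 : USERID : UNIX : alice" prints alice. Note that the lookup only works if the remote server knows your machine's address, which is exactly what a proxy in the middle hides.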
Normally proxy servers are configured to pass connections only from the machines they are there to serve (at least, they can be configured that way; I don't actually know how careful people are about this). But luckily I have found that the CERN proxy server itself accepts connections from anybody (at least, it accepts them from me!). So this is useful for doing experiments.

And, the great part is, almost all web clients are set up now for proxy support. The way you enable it varies from client to client. I believe most of the Mac and Windows clients have a preferences box which allows you to put in the address of your proxy server. On Unix, you can set environment variables. Here is the suggestion from the web page at CERN:

    #!/bin/sh
    http_proxy="http://www.cern.ch:911/"; export http_proxy
    ftp_proxy="http://www.cern.ch:911/"; export ftp_proxy
    gopher_proxy="http://www.cern.ch:911/"; export gopher_proxy
    wais_proxy="http://www.cern.ch:911/"; export wais_proxy
    exec Mosaic

This is a little shell script which runs Mosaic, first setting four environment variables to "http://www.cern.ch:911/", which is the proxy server I was referring to, the one which accepts connections from the rest of the world.

For the purpose of the experiment, only http_proxy needs to be set. Try setting that one, then run Lynx or Mosaic on your Unix workstation and connect to the printenv URL above. Compare the information that is shown with what you got earlier without the environment variable. Similarly, on other machines, try the printenv test with and without proxy serving enabled using the CERN proxy. I find that the proxy server does in fact prevent the remote site from seeing my computer's address, and without that address identd can't be used to reveal my name.

This technique has many ramifications. For example, if a US proxy server were available, ftp could be done via Mosaic to sites which only allowed connections from American computers.
People have been talking about writing special IP redirectors for this, but here it turns out the capability has been around all along.

Can anyone supply addresses of additional proxy servers to try? I had an idea about how to find them. Many web servers log accesses. By searching those access logs it might be possible to find proxy sites. The server is given information about whether a proxy is used, as well. This shows up in the HTTP_USER_AGENT environment variable on the printenv page. Servers could look for references to proxies in that data and collect proxy addresses in that way. There is a nice irony in using server logging to collect data that would allow users to defeat much server logging.

I got my information about proxies by reading:

    http://info.cern.ch/hypertext/WWW/Proxies/

Specific information on configuring CERN httpd as a proxy server is in:

    http://info.cern.ch/hypertext/WWW/Daemon/User/Proxies/Proxies.html

Modifications to the proxy server code would be necessary to provide some additional features, such as support of encryption between user and proxy server (via the SHTTP protocol extensions, perhaps; this way you could get local privacy even when connecting to servers which did not support encryption), or possibly chaining of proxies. I think this is a fertile area for discussion and further work.

Hal