
My studies show about 1 chance in 20 that any particular running remailer will drop a message. This is based on sending a bunch of messages through chains of remailers of varying length and seeing how many arrive in a day or so, and then calculating the average probability. These are not the lame remailers, either, these are the cream of the crop, based on various performance reports.

Note that in a chain of twenty remailers this causes many messages to be dropped. Even if we achieve a reliability rate of 1 in 100,000, this means about 1 in 5000 messages will be lost through a chain of 20 remailers - usable.

For some reason remailer culture has accepted message drops as normal. They aren't normal. There isn't any reason why a remailer should be any less reliable than a mail server. I estimate mail servers are far more reliable than 1 loss in 100,000 messages.

One possibility is that some of my tests went through remailers that are just down. This hasn't been ruled out beyond any reasonable doubt, but it appears that the drops are occurring in remailers which are still carrying traffic.

I did an extensive series of tests with an internal mixmaster network running Mixmaster 2.0.3. I had to fool around with it, but I was able to run many hundreds of messages through chains of 20 hops without a single message drop. The core software appears to be solid (unless bugs have been introduced into more recent releases ;-), but the setup needs to be tweaked.

The practice of putting something like this into .forward is a problem:

"|/usr9/loki/Devel/Mix/mixmaster -R >> /usr/spool/mail/username"

If too many messages come in at once this brings a machine to its knees, because it runs too many mixmaster processes at once. The problem feeds on itself: as more processes run, it takes longer for each one to complete. There may also be a bug in mixmaster which is only expressed when many mixmaster processes simultaneously try to queue a message.

(FYI, this also creates an easy DoS attack on a mixmaster node. Just send the node a bunch of messages which loop back into itself over and over again. This should rapidly cause sendmail to spawn many mixmaster processes and lock up the machine. I haven't tried this on anybody else's mixmaster node.)

The solution is to have a small program which puts incoming messages into a spool file. I've enclosed the perl script I used, which comes with no guarantees and hasn't been tested all that much. The script copies the message into a file in the spool directory starting with "temp". Then it moves it to a file starting with "spool". The reason for the move is to guarantee that all "spool" files are complete messages. The time of creation is encoded in the filename so a processing program can get the files in order of arrival.

The other pieces needed are a small perl script which monitors the spool directory and calls mixmaster to queue up each file, and another process which calls mixmaster repeatedly to process the queued messages (a rough sketch of the monitor follows the enclosed script). This setup causes no more than two mixmaster processes to run at a time and is more robust.

Would somebody mind polling the remailer operators to see how many are running mixmaster directly from .forward? I suspect this is most of the problem. If it isn't, we need to eliminate it before pursuing the next suspect.

---cut here------cut here------cut here------cut here------cut here------cut here---
#!/usr/bin/perl -w
# This code is in the public domain.
# Put in a .forward file. Spools mail into a directory for processing
# by something else.
require 5;
use strict;
use English;

my(@pwvals) = getpwuid($UID);
my($homedir) = $pwvals[7];
my($spooldir) = "$homedir/spool";

die "Home directory doesn't exist." if ! -e $homedir;
die "Home directory isn't a directory." if ! -d $homedir;
die "$spooldir doesn't exist." if ! -e $spooldir;
die "$spooldir is not a directory." if ! -d $spooldir;

# Encode the arrival time in the filename, zero-padded to a fixed width
# so that a plain lexical sort gives arrival order.
my($now) = time;
my($now_string);
if( length($now) < 12 ) {
    $now_string = ('0' x (12 - length($now))) . $now;
} else {
    $now_string = $now;
}

# Write the message to a "temp" file first ...
my($temppath) = "$spooldir/temp.$now_string.$$";
die "Amazingly, $temppath already exists." if -e $temppath;
open(SPOOLFILE, ">$temppath") or die "Could not open $temppath for writing.\n";
while(<>) {
    print SPOOLFILE $_;
}
close(SPOOLFILE);

# ... then rename it to a "spool" file, so every "spool" file is a
# complete message.
my($spoolpath) = "$spooldir/spool.$now_string.$$";
die "Amazingly, $spoolpath already exists." if -e $spoolpath;
if( ! rename $temppath, $spoolpath ) {
    die "Could not rename $temppath to $spoolpath.";
}

# Let sendmail know everything is fine.
exit(0);
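
Here is a rough sketch of the monitor piece described above. It assumes "mixmaster -R" will queue a message fed to it on stdin, as in the .forward line quoted earlier; the mixmaster path, spool layout, and polling interval are placeholders to adjust for your own setup.

#!/usr/bin/perl -w
# Sketch of the spool monitor. Watches the spool directory for complete
# "spool.*" files and feeds them one at a time to mixmaster for queueing.
# Assumes "mixmaster -R" reads a message on stdin, as in the .forward
# example above; paths and the sleep interval are placeholders.
require 5;
use strict;
use English;

my(@pwvals)   = getpwuid($UID);
my($spooldir) = "$pwvals[7]/spool";
my($mix)      = "/usr9/loki/Devel/Mix/mixmaster";   # adjust for your install

while(1) {
    opendir(SPOOL, $spooldir) or die "Can't read $spooldir.";
    # Only pick up "spool.*" files; "temp.*" files may still be being written.
    my(@files) = sort grep { /^spool\./ } readdir(SPOOL);
    closedir(SPOOL);

    foreach my $f (@files) {
        my($path) = "$spooldir/$f";
        # One message at a time, so at most one queueing process runs
        # alongside this monitor.
        if( system("$mix -R < $path") == 0 ) {
            unlink($path) or warn "Could not remove $path.";
        } else {
            warn "mixmaster failed on $path; leaving it in the spool.";
        }
    }
    sleep(30);    # polling interval is arbitrary
}

A second process (or a cron job) would then run whatever mixmaster invocation the operator already uses to flush the queued messages, so that no more than two mixmaster processes are ever running at once.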

On Thu, 9 Aug 2001, Nomen Nescio wrote:
My studies show about 1 chance in 20 that any particular running remailer will drop a message. This is based on sending a bunch of messages through chains of remailers of varying length and seeing how many arrive in a day or so, and then calculating the average probability. These are not the lame remailers, either, these are the cream of the crop, based on various performance reports.
Note that in a chain of twenty remailers this causes many messages to be dropped. Even if we achieve a reliability rate of 1 in 100,000, this means about 1 in 5000 messages will be lost through a chain of 20 remailers - usable.
For some reason remailer culture has accepted message drops as normal. They aren't normal. There isn't any reason why a remailer should be any less reliable than a mail server. I estimate mail servers are far more reliable than 1 loss in 100,000 messages.
Consider how few messages get lost in the CDR. The extra mix technology shouldn't inject any randomness into message processing. You should see message drops measured in the handful per year, max.

The next major question is to determine where the drops are happening. Inbound, outbound, inter-remailer, intra-remailer?

One aspect of this, assuming the remailers are under attack and that is the hypothesis we are going to assume, is that we need to be able to inject traffic into the remailer stream anonymously. Otherwise Mallet gets wise to what is going on and starts playing us. If at all possible all measurements should be made anonymously and as stealthily as possible.

Q: How to inject traffic into the remailer network anonymously?

Q: How do we measure the input/output flow without collusion of the operator?

Q: Where are the computing resources to munge resulting flood of data over at least a few weeks time period. How do we hide this 'extra' flow of data? It represents an opportunity for incidental monitoring due to load usage.

Q: How do we munge the data? What are we trying to 'fit'?

Q: Once we have the data and can (dis)prove the hypothesis, then what?

--
natsugusa ya...tsuwamonodomo ga...yume no ato
summer grass...those mighty warriors'...dream-tracks
                                        Matsuo Basho

The Armadillo Group, Austin, Tx         James Choate
www.ssz.com                             ravage@ssz.com, 512-451-7087

----- Original Message -----
From: "Jim Choate" <ravage@ssz.com>
To: <cypherpunks@einstein.ssz.com>
Sent: Wednesday, August 08, 2001 7:05 PM
Subject: CDR: Re: Mixmaster Message Drops
The next major question is to determine where the drops are happening. Inbound, outbound, inter-remailer, intra-remailer?
That matters from a correction view, but not from a usage view, which I assume we're taking. Basically we don't care what technology the remailer uses as long as it is correct technology and trustable. From there we care only about which remailers are dysfunctional and which are useful.
One aspect of this, assuming the remailers are under attack and that is the hypothesis we are going to assume, is that we need to be able to inject traffic into the remailer stream anonymously. Otherwise Mallet gets wise to what is going on and starts playing us.
Well assuming that the remailers are under attack, we start using digital signatures with initiation information stored in them. Mallet can introduce duplicates, but the likelihood of a duplicate being detected rises very quickly (i.e. at a rate of 1-(1/20)^M for M duplicate messages, assuming a drop rate of 1 in 20). This gives us the ability to discount the vast majority of what Mallet does and get very close to accurate values.

The bigger risk is for Mallet to identify our queries and force the proper functioning of the node exclusively for the query. Correcting this is much more difficult, but would only take the use of digital signatures and encryption on all the messages traversing the network. Since the remailer user is inherently a more developed user than Joe (l)User, this is much more reasonable. But it still approaches impossible, because the remailer users are a finite set, so Mallet could store all the remailer user keys and treat them differently from the query keys. This becomes extremely difficult once long-term keys are defeated as well as ephemeral keys.

Instead the remailer users will have to maintain statistics, or at least a large unknown portion of them will. If users upload to, say, Freenet once a month the number of anonymous messages they have sent and received (without mention of timeframe except implicitly the month), we could get an overall drop rate, and the users wouldn't have to reveal who they are.
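
A minimal sketch of that monthly aggregation, assuming each anonymous report boils down to a "sent received" pair of counts (the file name and line format here are invented for illustration):

#!/usr/bin/perl -w
# Sketch of the monthly aggregation idea above. Each (anonymous) report
# is assumed to be a line of the form "sent received"; the input file
# name is hypothetical.
use strict;

my($sent_total, $recv_total) = (0, 0);
open(REPORTS, "<monthly-reports.txt") or die "Can't read reports.";
while(<REPORTS>) {
    next unless /^\s*(\d+)\s+(\d+)\s*$/;
    $sent_total += $1;
    $recv_total += $2;
}
close(REPORTS);

die "No usable reports." unless $sent_total > 0;
my($droprate) = 1 - ($recv_total / $sent_total);
printf "Overall drop rate this month: %.4f (%d of %d messages lost)\n",
       $droprate, $sent_total - $recv_total, $sent_total;

Since nobody's report says which remailers were used, this only gives a network-wide figure, but it does so without anyone revealing who they are.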
If at all possible all measurements should be made anonymously and as stealthily as possible.
Agreed. I was beginning to address this above; it still has some major problems.
Q: How to inject traffic into the remailer network anonymously?
Through a set of trusted remailers. If those remailers are trusted and are used for test initiation, then the exact drop rate from that entry point will be known. This will build a reputation for those remailers, making it desirable for trustable remailer operators to be in that set (it increases the number of messages they carry), leading to better security when initiating from the trusted list.
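
A rough sketch of the bookkeeping for that, assuming we log one "entry-point delivered-or-not" record per test message (the log file name and line format are invented for illustration):

#!/usr/bin/perl -w
# Sketch of per-entry-point reputation tracking. Each log line is
# assumed to look like "remailer-name 1" (delivered) or "remailer-name 0"
# (lost); the log file name is hypothetical.
use strict;

my(%sent, %delivered);
open(LOG, "<entrypoint-tests.log") or die "Can't read test log.";
while(<LOG>) {
    next unless /^(\S+)\s+([01])\s*$/;
    $sent{$1}++;
    $delivered{$1} = 0 unless exists $delivered{$1};
    $delivered{$1} += $2;
}
close(LOG);

# Rank entry points by observed delivery rate.
foreach my $r (sort { $delivered{$b}/$sent{$b} <=> $delivered{$a}/$sent{$a} }
               keys %sent) {
    printf "%-20s %5.1f%% delivered (%d/%d)\n",
           $r, 100 * $delivered{$r} / $sent{$r}, $delivered{$r}, $sent{$r};
}

The delivery percentages are exactly the reputation that would make it attractive for an operator to be on the trusted list.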
Q: How do we measure the input/output flow without collusion of the operator?
You count the messages in and the messages out; you don't care what they say, where they're from, etc. The operator doesn't even need to know you're doing it. Of course this is a rather difficult task. The better option would be to test the network as a whole, by having users collude to collect statistics on their own messages going through. This would defeat much of what Mallet could do, because the test messages would be real messages being propagated through.
Q: Where are the computing resources to munge resulting flood of data over at least a few weeks time period. How do we hide this 'extra' flow of data? It represents an opportunity for incidental monitoring due to load usage.
Wouldn't be that bad. Treating the network as a function of its entry point seems easiest. Then it's just a simple fraction which can be published raw, or you can waste 4 seconds on a 1GHz machine and compute the values. Either way it's not compute intensive; most of the work needs to be done by legitimate users with legitimate messages (to prevent Mallet from playing with the messages).
Q: How do we munge the data? What are we trying to 'fit'?
We are trying to determine the best entry-point for anonymous remailer use as measured by percentage of messages that reach their destination, as filtered by being "trusted".
Q: Once we have the data and can (dis)prove the hypothesis, then what?
Then we only trust the servers on the "trusted" list, and we use the best remailer from the list in terms of delivery. This will encourage individuals that run worthless remailers to improve their systems, eventually leading to the dropping of only a handful of messages a year.

Joe

On Wed, 8 Aug 2001, Joseph Ashwood wrote:
----- Original Message -----
From: "Jim Choate" <ravage@ssz.com>
To: <cypherpunks@einstein.ssz.com>
Sent: Wednesday, August 08, 2001 7:05 PM
Subject: CDR: Re: Mixmaster Message Drops
The next major question is to determine where the drops are happening. Inbound, outbound, inter-remailer, intra-remailer?
That matters from a correction view, but not from a usage view, which I assume we're taking. Basically we don't care what technology the remailer uses as long as it is correct technology and trustable. From there we care only about which remailers are dysfunctional and which are useful.
It matters from a resolution perspective as well. The fact is you claim 1/20 messages are being dropped (per remailer or per submission? Do you see any difference in drop rate depending on the number of remailers in the chain? You should). That's an ASTOUNDINGLY high figure, several orders of magnitude over any expectation of 'normal' behaviour. This means that not only is the technology itself in question, but its implementation as well. In short, with this level of failure you can't 'trust' anything.
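
For reference, here is the chain-length dependence a fixed per-hop drop rate would produce (the 1-in-20 and 1-in-100,000 figures are the ones from the original post; the rest is arithmetic):

#!/usr/bin/perl -w
# Compound a per-hop drop probability across chain lengths. With the
# claimed 1-in-20 per-hop rate a 20-hop chain loses roughly 64% of
# messages; at 1-in-100,000 it loses about 1 in 5,000.
use strict;

foreach my $p (1/20, 1/100_000) {
    foreach my $hops (1, 5, 10, 20) {
        my($delivered) = (1 - $p) ** $hops;
        printf "per-hop drop %-12.6f hops %2d  chain drop rate %.6f\n",
               $p, $hops, 1 - $delivered;
    }
}

If the measured drop rate does not grow with the number of remailers in the chain, that by itself says the loss isn't a uniform per-hop failure.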
Well assuming that the remailers are under attack, we start using digital signatures with initiation information stored in them. Mallet can introduce duplicates,
Duplicates are not drops, signatures do nothing for drops. You're changing the rules in the middle of the game.
Q: How to inject traffic into the remailer network anonymously?
through a set of trusted remailers,
Which we don't have if we accept your numbers. Depending on the technology you're trying to vet is a recipe for disaster (well Mallet won't think so).

--
natsugusa ya...tsuwamonodomo ga...yume no ato
summer grass...those mighty warriors'...dream-tracks
                                        Matsuo Basho

The Armadillo Group, Austin, Tx         James Choate
www.ssz.com                             ravage@ssz.com, 512-451-7087

First let me say those were not my numbers; those numbers were supplied by another source, I simply reiterated them.

----- Original Message -----
From: "Jim Choate" <ravage@ssz.com>
To: <cypherpunks@einstein.ssz.com>
Sent: Saturday, August 11, 2001 6:19 PM
Subject: CDR: Re: Mixmaster Message Drops
On Wed, 8 Aug 2001, Joseph Ashwood wrote:
Well assuming that the remailers are under attack, we start using digital signatures with initiation information stored in them. Mallet can introduce duplicates,
Duplicates are not drops, signatures do nothing for drops. You're changing the rules in the middle of the game.
Actually, if you are simply testing the number of messages that come in versus the number that go out, duplicates are a worry. If we are ignoring the content, then a message stream of 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20 looks identical to 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, so that failure mode needs to be addressed. The individual signatures address that issue, which means that we can distinguish between the two message streams. This allows us to detect that (for example) numbers 13 and 17 got dropped, and that to cover the separation in the stream Mallet duplicated messages 7 and 11. This gives us a level of traceability that we can enforce ourselves, outside of the network system. I believe that detecting and eliminating duplicates eliminates a very important activity that Mallet could perform to throw off our measurements.
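
A small sketch of that bookkeeping, assuming each test message carries a (signed) sequence number we can read back out at the far end; the "arrived" list is made-up data matching the example above:

#!/usr/bin/perl -w
# Compare the sequence numbers we sent against those that arrived,
# classifying drops and duplicates. The arrived list is invented data
# for the example above: 13 and 17 dropped, 7 and 11 duplicated.
use strict;

my(@sent)    = (1 .. 20);
my(@arrived) = (1, 2, 3, 4, 5, 6, 7, 7, 8, 9, 10, 11, 11, 12, 14, 15, 16, 18, 19, 20);

my(%seen);
$seen{$_}++ foreach @arrived;

my(@dropped)    = grep { ! $seen{$_} } @sent;
my(@duplicated) = grep { $seen{$_} && $seen{$_} > 1 } @sent;

print "Dropped:    @dropped\n";      # 13 17
print "Duplicated: @duplicated\n";   # 7 11

Because the sequence numbers ride inside signed messages, Mallet can replay them but can't convincingly mint new ones, so the duplicates stand out and can be discounted.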
Q: How to inject traffic into the remailer network anonymously?
through a set of trusted remailers,
Which we don't have if we accept your numbers. Depending on the technology you're trying to vet is a recipe for disaster (well Mallet won't think so).
Actually you can start with just one trusted remailer. If you can get in and personally inspect one remailer, or run it yourself, you can trust a single one.

Once the single trusted location has been established, you begin routing information through that single entry point, and make use of that entry point to measure to depth 2. Once you have built trust in a depth-2 entry point, you can then test it as a depth 1, making sure that Mallet doesn't allow proper passthrough for just a single entry point. From there you will have 2+ entry points to begin more depth-2 tests from, and you repeat until the trust base has reached the necessary level.

Of course this testing has to be maintained continually, but the ability to send a couple dozen messages through each remailer each day should provide enough maintenance power.

Joe
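
A toy simulation of that bootstrap, just to show the promote-if-it-delivers loop; the per-remailer drop rates are invented, and "sending" a test message is faked with rand():

#!/usr/bin/perl -w
# Toy simulation of the trust bootstrap described above. Remailer drop
# rates are invented; a depth-2 test message "survives" only if both
# hops deliver it. Candidates that deliver enough tests get promoted
# into the trusted set.
use strict;

my(%droprate) = (A => 0.00, B => 0.01, C => 0.05, D => 0.30);   # made up
my(@trusted)  = ('A');        # the one remailer we inspected or run ourselves
my($tests)    = 200;          # test messages per candidate
my($cutoff)   = 0.98;         # required delivery rate to be trusted

my(%is_trusted) = map { $_ => 1 } @trusted;

foreach my $candidate (sort keys %droprate) {
    next if $is_trusted{$candidate};
    # Route the depth-2 tests through an already-trusted entry point.
    # (In practice you would rotate among all of @trusted.)
    my($entry)     = $trusted[0];
    my($delivered) = 0;
    foreach (1 .. $tests) {
        $delivered++ if rand() > $droprate{$entry}
                     && rand() > $droprate{$candidate};
    }
    my($rate) = $delivered / $tests;
    if( $rate >= $cutoff ) {
        push @trusted, $candidate;
        $is_trusted{$candidate} = 1;
    }
    printf "%s: %.3f delivered -> %s\n",
           $candidate, $rate, $rate >= $cutoff ? "trusted" : "not trusted";
}
print "Trusted set: @trusted\n";

The couple dozen messages per remailer per day mentioned above is what keeps these estimates current once the set is built.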

On Sat, 11 Aug 2001, Joseph Ashwood wrote:
Actually you can start with just one trusted remailer.
Bull.
If you can get in and personally inspect one remailer, or run it yourself, you can trust a single one.
Only so long as you have a process to vet its current behaviour against past behaviour. As to remailers you don't operate, your trust with respect to 'getting in' lasts until you walk out. Once you're gone the potential for re-config'ing the remailer is present.

--
natsugusa ya...tsuwamonodomo ga...yume no ato
summer grass...those mighty warriors'...dream-tracks
                                        Matsuo Basho

The Armadillo Group, Austin, Tx         James Choate
www.ssz.com                             ravage@ssz.com, 512-451-7087

----- Original Message -----
From: "Jim Choate" <ravage@ssz.com>
To: <cypherpunks@einstein.ssz.com>
Sent: Saturday, August 11, 2001 7:07 PM
Subject: CDR: Re: Mixmaster Message Drops
On Sat, 11 Aug 2001, Joseph Ashwood wrote:
Actually you can start with just one trusted remailer.
Bull.
Well technically you begin by granting trust to someone, the easiest being yourself. Then you build trust in something else, using trust in one thing to build trust in another. Regardless, you have to establish trust in a single location before you can build trust in multiple.
If you can get in and personally inspect one remailer, or run it yourself, you can trust a single one.
Only so long as you have a process to vet its current behaviour against past behaviour.
I'm only worrying about future behavior, the past being something that will never matter again in the behavior of the system, especially since we can't plan on testing "in the past." In this particular case we are only concerned with whether or not it will forward every message sent to it within some very tight bounds, so it is unlikely that a non-malicious entity would change the configuration; if the configuration is changed in that respect, it will be detected later on during the testing phase. The remaining issue is that it could represent other remailers as undependable, possibly while making itself look flawless. This is a very difficult problem to solve.
As to remailers you don't operate, your trust with respect to 'getting in' lasts until you walk out.
But we can build trust in its ability to forward messages in a way that is, for some definition, anonymous. In particular, the project is to eliminate its dropping of messages, either at random or maliciously, or at least to determine which remailers perform the dropping and at what rate.
Once you're gone the potential for re-config'ing the remailer is present.
Which won't matter, because the particular behavior we are trying to detect will show up if they change, simply by the fact that the messages will be dropped. All we're trying to do is build trust in the ability of a single location to begin an anonymous analysis of the abilities of the others to forward messages without dropping any (save a small ratio).

Joe

On Sat, 11 Aug 2001, Joseph Ashwood wrote:
On Sat, 11 Aug 2001, Joseph Ashwood wrote:
Actually you can start with just one trusted remailer.
Bull.
Well technically you begin by granting trust to someone, the easiest being yourself. Then you build trust in something else, using trust in one thing to build trust in another. Regardless, you have to establish trust in a single location before you can build trust in multiple.
The point is that one remailer isn't sufficient (especially if you've got some sort of fixed latency, which is pretty much a given even if the window is 24 hours). Assume a single remailer. Both input and output streams are sampled. We must assume that the from: and to: are observable. To deny the from: implies a secondary anonymizing layer, contrary to our initial assertion. To deny the to: implies that there is an additional layer between the last remailer and the recipient of the traffic. The goal of a remailer is to break the connection between a given incoming and a given outgoing message (i.e. mix the to:/from: relation) - not necessarily to hide either. So, even if we generate cover traffic, a comparison of all outgoing with each incoming is clearly realizable using modest resources (not including the tap itself). As long as your sample window around each from: event is greater than the latency, you'll find a statistical correlation to the to: event. It takes at least two remailers, and they must in addition talk with a shared keyset different from anything either from: or to: might use to encrypt the body.

Now, let's talk about that cover traffic for a moment... It's random, so subsequent comparisons based around to: or from: events will cancel out, because they won't be repeated often enough. There is the additional question of just exactly who we're sending that cover traffic to, since we're the only remailer. In addition there is the problem of exactly how we produce all that bogus traffic. My suggestion is to keep a steady flow of messages (i.e. #msg/t >= C, d(#msg)/dt = 0) to a well-known set of addresses (remailers, of course) and then randomly replace the bogus traffic with real traffic in a round-robin selection process.

Any real-world realizable system requires(!) at least two remailers who are not(!) in collusion with each other, and they must share a different set of keys than either to: or from: might use.

The only workable solution to that which doesn't involve 'trust' per se is to move to a distributed environment with autonomous process execution. There is of course a discussion of self-sufficiency with respect to e-cash schemes in here, but that's a different topic. Additionally, there is the Plan 9 process bidding mechanism that fits in here as well. Then you request an image of that program (which is running other images elsewhere) to run in your own process space. You can verify the signature of the image by comparing it to known good signatures (and the algorithm that produced them). This way each image of the program gets 'paid' for by being run by the user directly (though they have zero control over its environment, mind you - excepting hacking the process mechanism in the OS - which recurses to the remailer getting a signature of the OS and itself to verify it hasn't been hacked. Some nifty problem/solution sets in here).

You could of course do this today with the existing mixmasters to a certain extent. Linux would of course be a good example, though from an image-used/signature perspective MS would be more stable, what with all the custom Linux kernel compiles (and over 180 distros).

(If any of these ideas are patentable then I put them in the public domain)

--
natsugusa ya...tsuwamonodomo ga...yume no ato
summer grass...those mighty warriors'...dream-tracks
                                        Matsuo Basho

The Armadillo Group, Austin, Tx         James Choate
www.ssz.com                             ravage@ssz.com, 512-451-7087
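
As a footnote to the steady-flow suggestion above, here is a small sketch of it, with placeholder addresses and a stubbed-out send routine; the point is only that the outgoing rate stays constant whether or not real traffic is waiting:

#!/usr/bin/perl -w
# Sketch of "steady flow, randomly swap real for bogus" cover traffic.
# Addresses and the send routine are placeholders; the invariant is a
# fixed number of outgoing messages per interval (#msg/t >= C).
use strict;

my(@remailers) = ('mix1@example.org', 'mix2@example.org', 'mix3@example.org');
my($per_tick)  = 5;           # fixed outgoing messages per interval
my(@real_queue);              # real messages waiting to go out

sub send_message {            # stub: in reality, hand off to the remailer client
    my($to, $body) = @_;
    print "send to $to: $body\n";
}

sub run_tick {
    my($slot) = 0;
    foreach (1 .. $per_tick) {
        my($to) = $remailers[$slot++ % @remailers];    # round-robin selection
        # Randomly decide whether this slot carries real or bogus traffic,
        # so an observer sees the same flow either way.
        if( @real_queue && rand() < 0.5 ) {
            send_message($to, shift @real_queue);
        } else {
            send_message($to, "bogus " . int(rand(1_000_000)));
        }
    }
}

# Example: one interval with real traffic queued, one without.
push @real_queue, "real message A", "real message B";
run_tick();
run_tick();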
participants (4)
- Jim Choate
- Jim Choate
- Joseph Ashwood
- Nomen Nescio