---------- Forwarded message ----------
From: Roger Dingledine <arma@mit.edu>
Subject: Re: [tor-talk] Clarification of Tor's involvement with DARPA's Memex

On Fri, Apr 17, 2015 at 05:38:37PM +0100, Thomas White wrote:
> there is some references to DARPA collaborating with some developers from Tor Project. I'd like to ask the developers of Tor to clarify what this involvement entails and why effort is being put towards a LE tool instead of working on hiding Tor users through improving anonymity or developing more circumvention based-tech.

Hi Thomas,

Thanks for asking. I apologize for not explaining these answers earlier. I'm still trying to find the right balance for my time between mentoring people in the Tor community vs better broader communication too.

Let me give you some background, and then I'll answer your question.

First of all, yes indeed we've been getting some funding from the Memex project. This is what has allowed us to pay attention to and move forward on some of the really cool things we've been working on lately for hidden services:

* Fixing many performance and consistency problems with hidden services, e.g.:
  https://trac.torproject.org/projects/tor/ticket/11447
  https://trac.torproject.org/projects/tor/ticket/13211
  https://trac.torproject.org/projects/tor/ticket/13447
  https://trac.torproject.org/projects/tor/ticket/13700
  https://trac.torproject.org/projects/tor/ticket/14219
  https://trac.torproject.org/projects/tor/ticket/14224

* Fleshing out the design and analysis for the "direct onion service" option that folks like Facebook want:
  https://lists.torproject.org/pipermail/tor-dev/2015-April/008625.html
  plus discussing other tradeoffs between upcoming design choices:
  https://lists.torproject.org/pipermail/tor-dev/2015-April/008597.html

* The work to let Tor controllers configure a hidden service directly without using the torrc file, which the Globaleaks folks (among others) are really excited to start using:
  https://trac.torproject.org/projects/tor/ticket/6411

* The privacy-preserving statistics that let us conclude numbers like "3-4% of Tor traffic is hidden service related" and "there are around 30000 hidden services today":
  https://blog.torproject.org/blog/some-statistics-about-onions

* Assessing, triaging, and putting out new Tor releases to fix hidden service security (stability) bugs recently:
  https://blog.torproject.org/blog/tor-02512-and-0267-are-released

* I hear that Rob Jansen and others have been working on a more realistic replacement for TorPerf
  (https://gitweb.torproject.org/torperf.git) which will let us measure performance to a hidden service over time and better understand where the bottlenecks are.

* I've also been talking to EFF about kicking off a Tor Onion Challenge (to follow on from their Tor Relay Challenges), to a) get many people to make their website or other service accessible as an onion site, and b) come up with and/or build a novel use of onion services, to go with the quite cool list that we have already but have done a poor job of publicizing: Pond, Globaleaks, SecureDrop, Ricochet, OnionShare, facebook's https onion, etc.

You see, I used to be on the "making your normal website reachable as an onion service is stupid" side of the fence, but I have since come to realize that I was wrong. You know how, ten years ago, website operators would say "I don't need to offer https for my site, because my users ____" and they'd have some plausible-sounding excuse? And now they sound selfish and short-sighted if they say that, because everybody knows it should be the choice of the *user* what security properties she gets when reaching a service? I now think onion services are exactly in that boat: today we have plenty of people saying "I don't need to offer a .onion for my site, because my users _____". We need to turn it around so sites let their *users* decide what security (encryption, authentication, trust) properties they want to achieve while interacting with each site.

Our "3-4%" stat has actually been used by some of the other people (at other groups) who are funded by Memex. They're talking to (among others) the child porn division of the Department of Justice, and I've taught them enough about Tor that they've basically turned into Tor advocates on our behalf. They've found actual numbers to be really useful at countering the FUD that some government people start out with.
One of these people explained to me last week that they listen to her more than she thinks they'd listen to me, since she shows up as a neutral party. In any case I am happy to have more people working on the "teach law enforcement how Tor actually works" topic, which you can read more about here:
  https://blog.torproject.org/blog/trip-report-tor-trainings-dutch-and-belgian...
  https://blog.torproject.org/blog/trip-report-october-fbi-conference

We do indeed need to be very careful and very thoughtful about what things in the Tor network are safe to measure. The general heuristic we've been using so far is: "Is that measurement taking advantage of something that you could instead fix? If so, it's not ok to measure it."

A prime example here of what's over the line is running relays that get the HSDir flag and then recording what hidden service descriptors they see (and thus what hidden services they learn about). We would instead like to treat that as a vulnerability and fix it:
  https://trac.torproject.org/projects/tor/ticket/8106
  https://trac.torproject.org/projects/tor/ticket/8243
  https://trac.torproject.org/projects/tor/ticket/8244
and see also the "Attacks by Hidden Service Directory Servers" section of
  https://blog.torproject.org/blog/hidden-services-need-some-love
as well as the section after it. (There are other researchers who have used that technique, e.g.
  http://freehaven.net/anonbib/#oakland2013-trawling
and also Gareth Owen's talk at 31c3. But we need to hold ourselves to a higher standard.)

On the other hand, if people publish a .onion address on a normal website and Google runs across it and indexes the name, then it seems clear that that's public information.

There are many other ways to learn about hidden service names which are ethically in-between, e.g.
  http://blogs.verisigninc.com/blog/entry/new_from_verisign_labs_measuring1
These are great topics for us as a community to keep discussing.

Similarly, if your .onion address is public, and your webserver doesn't require any authentication, and somebody fetches the content on it... that also seems like public information. And if, for example, the onion service is a forum, and users go there and then write their names down or provide other identifying information, that isn't really a bug or design flaw that Tor can fix.

These days there are services like Ahmia that list and index a bunch of onion names and content:
  https://ahmia.fi/search/
And to be clear, I think this is a great trend: we need to make onion services easier to understand and more accessible (and faster and more robust) for ordinary people, or we'll remain stuck with all the metaphors that include the word 'dark'.

Ok, now that I've provided some background, I should try to answer your question more clearly: we're using the Memex money to make hidden services stronger, and we're teaching other people how Tor works. In terms of teaching, it's the same thing I do for every other audience: explain about all the projects Tor works on (Tor, Tor Browser, pluggable transports, metrics, OONI, ...), which projects do what, how to measure and assess Tor's anonymity, what problems we don't have great answers for, and so on.

Part of making Tor work better means that it works better for these people too. And some of these people are indeed working on tools to gather and organize public content from hidden services, with the intent that groups like law enforcement will find their tools useful. We're not working on these tools, but when Tor becomes better (for everybody) these tools work better (for the groups they have in mind).

It is a tricky balance, but I think we have the balance right in this case. Would I rather have funding where it's easier to find a good balance? Absolutely.
That's a major part of why we've been talking about funding and funding diversity so much lately, and why we've been thinking about crowdfunding specifically for hidden service design improvements, and about growing our donation base and sustainability through donations and other avenues. We need help from all of you to get there.

I don't want to play the "they'd do it anyway" card too strongly here -- first because who knows, maybe they wouldn't, and second because there are definitely some activities that you stay away from no matter the balance. I've talked a lot with the program manager of Memex, and he's completely supportive of the "don't weaken Tor" mandate. In that sense we're aligned: he very strongly believes that weakening Tor would screw up this balance. I trust his intentions, and in any case we're the ones doing the technical side of Tor so we can make sure that we do the right thing.

I should also make clear my opinion on some of the bad uses of Tor. The folks who are using Tor for child porn, even though they are a tiny fraction of overall Tor users, are greatly hurting Tor -- by changing or reinforcing public perceptions of what privacy is for, and also by attracting the attention and focus of law enforcement and making that the way that law enforcement first learns about Tor. So, fuck them, they should get off our network, that's not what Tor is for and they're hurting all of us.

Now, that doesn't mean we should weaken Tor, even if we don't want them on the network. That slope is too easy to slip down, and we must not get into the business of dictating what is acceptable behavior for Tor users (which would eventually lead to designing technical mechanisms to enforce these choices).

I just went back to re-read the Forbes article, and in retrospect it sure makes it look like all of these companies are working on tools that relate to Tor hidden services. They aren't.
The main focus for Memex is on automatically parsing and collecting info from ads on e.g. craigslist, and generally getting better at the 'big data' side of searching and organizing this data. More generally, Memex is made up of a bunch of different companies, each doing their thing. I guess this is another casualty of the ambiguity of the phrases 'dark web' and 'deep web', since journalists find them hot to talk about but nobody reliably knows what they refer to.

If you want to follow along with the actual technical work we're doing, I invite you to observe or participate in the periodic "SponsorR" meetings that happen on IRC:
  https://trac.torproject.org/projects/tor/wiki/org/sponsors/SponsorR
  http://meetbot.debian.net/tor-dev/

Thanks,
--Roger