[tahoe-dev] switching from introducers to gossip?

Brian warner at lothar.com
Sun Jul 1 17:51:30 PDT 2012


On 6/14/12 1:02 PM, Zooko Wilcox-O'Hearn wrote:

> Brian has been posting patches that move away from using introducers
> at all in favor of "gossip". Now if I understand correctly, gossip is
> simply "every node is an introducer (in addition to whatever other
> jobs it does)".

Yeah, the general idea is that all nodes provide the "grid-control"
service, in addition to the "storage" service they might be providing
right now. Nodes announce both "grid-control" and "storage" via the same
introducer Announcements as before. The old Introducer becomes a node
that only provides "grid-control", on a pre-published FURL.

"grid-control" lets you publish Announcements (either your own or ones
you're forwarding from others), and subscribe to the same.

Once that is in place, and we have some code to prevent infinite
flooding loops, there are several different approaches you could take:

* fully-connected mesh: every node makes a Foolscap connection to every
  grid-control provider it hears about, subscribes to hear about all
  announcements, and publishes any announcements that the other side
  doesn't already know about.
* opportunistic: clients only connect to storage servers, and storage
  servers don't make outbound connections to anybody, but if you *do*
  happen to be connected to someone who also offers "grid-control", then
  connect to their grid-control object too and exchange Announcements
* cluster-of-Introducers: normal nodes don't offer grid-control, but
  multiple Introducers do, and all of them know about each other. All
  nodes connect to all grid-control providers (which means all
  Introducers).
* one Introducer: this is just a degenerate cluster-of-Introducers

> Hrm. This idea of gossip conflicts with my idea that each server
> should attempt to connect to all clients -- and only to clients -- and
> that each client should attempt to connect to all servers -- and only
> to servers (#344, #1086).

I think we can probably accommodate that. I'm optimizing for our two main
use cases: friendnet and paid-service.

In the friendnet, nearly all nodes are both a client *and* a server.
Client-only nodes (like the one I occasionally connect to VG2 to
investigate bug reports), or server-only nodes (imagine a paid storage
server, the "rent-a-friend" idea I've talked about before) are rare. So
ruling out C->C or S->S connections doesn't change very much.

In the paid-service case (allmydata), we don't want clients talking to
each other (they're all behind NAT anyways). But we could allow S->S
connections without problems, and if all servers know about all other
servers, then we could add new servers to the grid by just connecting
them to at least one existing server, and knowledge of them would flood
quickly and reliably to everyone else.

> It would also interact somewhat poorly with #444

Note that we don't need active+online connections to all other nodes all
the time. Connecting with less than 100% duty cycle would still get the
information distributed eventually. What I'm really expecting is that
we'll use Zooko's clever log-scaling flooding techniques (from Mnet) to
limit the amount of traffic and connections but still achieve
rapid+reliable diffusion of knowledge.
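As a rough illustration of why a small number of connections can be
enough (this is NOT the Mnet technique, whose details aren't spelled
out here, just a textbook doubling schedule assumed for the sake of
example): if in round k every informed node contacts the node 2**k
positions further around a ring, the set of informed nodes doubles each
round, so n nodes are covered in about log2(n) rounds with one
connection per node per round.

```python
def rounds_to_full_diffusion(n):
    """Count rounds until an announcement from node 0 reaches all n nodes.

    Assumed schedule (illustrative only): in round k, every node that
    already knows the announcement pushes it to node (i + 2**k) % n.
    """
    informed = {0}
    rounds = 0
    while len(informed) < n:
        step = 2 ** rounds
        # Each informed node makes exactly one outbound contact this round.
        informed |= {(i + step) % n for i in informed}
        rounds += 1
    return rounds


small = rounds_to_full_diffusion(16)
larger = rounds_to_full_diffusion(100)
```

With 16 nodes this takes 4 rounds, and with 100 nodes it takes 7,
compared to the 15 or 99 connections per node that a full mesh needs.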

> In fact, why do we need to switch from introducers to gossip at all?
> Could we finish the rest of the #466 new-introduction-protocol and
> related accounting infrastructure while leaving the current
> centralized introducer (or the #68 multiple introducers) alone?

They aren't interdependent, for sure. Now that #466 is in trunk, we've
got a handle on Announcements (i.e. the node key that signs each one) so
recipients can make decisions about whether they'll accept the thing
being introduced or not, independently of the channel by which they
received the announcement. *That* is important to unlock alternate
introduction topologies: without signed announcements, the only form of
grid control you can get is to limit who gets access to the Introducer
(as the VG2 folks accomplish by changing the introducer.furl each time
it is accidentally leaked). But with signed announcements, you don't
need control over the channel to retain control over which servers your
client uses, or over which clients your server will serve. You could
even safely use a single massive universe-spanning broadcast channel, if
you could make it efficient enough.

And the first steps of Accounting don't require changes to introduction
at all. These steps will enable tracking of who-uses-what, and manual
control (probably by pasting nodeids into tahoe.cfg) over both
which-servers-should-I-use and which-clients-should-I-accept. This needs
signed announcements (to get a strong nodeid of a server) and signed
accounting-facet-of-storage-server FURLification messages (so clients
can demonstrate control of a key). The main question is whether nodes
which are both clients and servers should have a single key, or two
separate keys (I prefer a single key, because it makes reciprocal
storage-permission grants easier).
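A minimal sketch of what that manual control might look like, and of
why a single key makes reciprocal grants simpler. Everything here is
hypothetical: the class, the method names, and the nodeid strings are
invented for illustration, and it assumes the nodeids have already been
checked against a valid signature.

```python
class AccessPolicy:
    """Manual allow-lists keyed by (already verified) nodeids."""

    def __init__(self):
        self.servers_to_use = set()      # which-servers-should-I-use
        self.clients_to_accept = set()   # which-clients-should-I-accept

    def should_use_server(self, server_nodeid):
        return server_nodeid in self.servers_to_use

    def should_accept_client(self, client_nodeid):
        return client_nodeid in self.clients_to_accept

    def grant_reciprocal(self, peer_nodeid):
        # Only works as one step if the peer uses the SAME key as client
        # and as server. With two separate keys we would need to learn
        # and record two unrelated nodeids for the same friend.
        self.servers_to_use.add(peer_nodeid)
        self.clients_to_accept.add(peer_nodeid)


policy = AccessPolicy()
policy.grant_reciprocal("v0-bob-example-nodeid")       # made-up nodeid
uses_bob = policy.should_use_server("v0-bob-example-nodeid")
serves_bob = policy.should_accept_client("v0-bob-example-nodeid")
serves_stranger = policy.should_accept_client("v0-unknown-nodeid")
```

Note that the decision depends only on the nodeid, not on which channel
delivered the announcement.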

The second steps of Accounting, where we try to make things easy and
automatic for our common use cases, are where we start getting into my
Invitation scheme, and where gossip becomes more interesting. What I
really want is to make it super-easy for a new user to get their node
running and connected to their friend's existing grid. And, more
importantly, for that *first* friend to set up that grid.

Imagine for a moment that we have a nicely-packaged OS X or Debian app,
already distributed via the Mac App Store or through apt/etc. And also
imagine that we've got UPnP working (or something equivalent, maybe
involving a relay or some helper service that we run), so NAT isn't a
problem. Then this is my goal:

  The first friend (Alice) hears about Tahoe from her favorite blog, and
  installs it with her favorite package manager. She launches it for the
  first time, and it asks "start your own grid, or join someone
  else's?", and she picks "start your own". Her node starts up,
  establishes an external IP address, sets itself up to restart at
  reboot, and announces that Alice is now the proud member of a 1-node
  grid, and that she should invite a few friends to join before she'll
  get more than educational value out of the system. She hits the
  "Invite A Friend" button, types Bob's (pet)name and email address into
  the box, and the node sends Bob a message with links to the
  application, instructions, and an invitation code.

  Bob gets Alice's email, downloads+installs the app, and pastes in the
  invitation code. The next thing he sees is a picture of the two-node
  grid, with the Alice and Bob nodes labeled, and he can upload files
  and either retrieve them locally or share them with Alice.

  Later, Alice and Bob invite other people to join in their grid. The
  only grid-specific coordinate that each new member needs is a
  single-use invitation code like "d77hbsmkgeufjpwacu3ywkbwem".
  Eventually, Alice leaves the grid, but her departure doesn't affect
  the remaining members: they can still connect and exchange shares as
  usual.

  All grid members get a control panel where they can see who else is
  using their storage, allow/deny access, and control where their own
  node places shares. By default, anyone who gets invited to join the
  grid gets full access to storage on all members' servers, but access
  can be revoked at any time.
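Here is a rough sketch of the single-use invitation code from the story
above. The example code in the story is 26 characters that all fall in
the lowercase base32 alphabet, which is the length you get from 16
random bytes, so this sketch assumes that encoding; the real scheme may
differ. InvitationBook and its methods are hypothetical names, and the
sketch says nothing about how a code gets turned into actual grid
coordinates.

```python
import base64
import secrets


def new_invitation_code():
    # 16 random bytes -> 26 base32 characters once padding is stripped.
    raw = secrets.token_bytes(16)
    return base64.b32encode(raw).decode("ascii").rstrip("=").lower()


class InvitationBook:
    """Tracks outstanding invitations; each code can be redeemed once."""

    def __init__(self):
        self._outstanding = {}   # code -> petname of the invitee

    def invite(self, petname):
        code = new_invitation_code()
        self._outstanding[code] = petname
        return code

    def redeem(self, code):
        # pop() removes the code, so a second attempt returns None.
        return self._outstanding.pop(code, None)


book = InvitationBook()
code = book.invite("Bob")
first_try = book.redeem(code)    # the invited petname
second_try = book.redeem(code)   # None: the code is already used up
```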

The corresponding story with an AllMyData-like paid-service is:

  Alice visits allmydata.com, signs up for the service with a credit
  card, downloads the client app and gets an invitation code for her
  account. She pastes the invitation code into the "accept an
  invitation" box when her app starts up.

  Her app connects to all AllMyData storage servers and is allowed
  storage access. New servers can be added without Alice's involvement.
  Any subset of the servers can go away without affecting her ability to
  connect to (or learn about) the rest.

To support those stories, I don't want Alice (or AllMyData) to be
running a single Introducer, or even a cluster of Introducers. Alice,
Bob, and the other members of the friendnet should *all* be helping each
other connect to the rest of their grid. Otherwise they have to pay
attention to how many Introducers are present, and who's responsible for
them, and make sure there are enough left available to accommodate
changes.

Does that help explain my interest in gossip-based introduction?

cheers,
 -Brian
