# [mnet-devel] Grid Of Trust -- pre-design

Zooko O'Whielacronx zooko at zooko.com
Tue Nov 25 06:37:31 PST 2003

``` Some Guy wrote:
>
> Hey guys who ever wants to take first peek at my baby DHT/premix idea please
take a peek and
> crtitique it.  It's a bit rough around the edges and there's plenty of room
for modification, but
> I'm pretty sure the ideas are sound.
>
> http://de.geocities.com/amichrisde/Grid_Of_Trust.html
>
> The basic idea is to force data and nodes to be randomly distributed in a
grid.  This provides
> resistance against serveral types of attacks.
>
> Feel free to comment on or off list.

Here is the text of that web page, for the sake of future generations who are
required to read the mnet-devel archives as part of their education:

Grid Of Trust

This isn't a design document.  It's more of a proposal for what could be
to make sure all the attacks are really covered.

Topology

Nodes are distributed randomly into cells of a hypercube with d
dimensionals.  Each axis is broken down into g pieces such that the
hypercube contains gd cells.

Nodes in a cell can connect to all nodes in cells whose coordinates vary
in at most one dimension.  This is different than CAN where only dirrect
neighbors have links.  A DHT based on this type of a network should only
require d hops to handle a request.  The cost is of coarse a higher
neihbor count.

We will define x as the expected population or a cell and N as the total
number of nodes.
x = N/(gd)

We will define m as the expected neighbor count.
m = x(d*(g-1)+1)

So let us visualize a small network with g=8 and d=2 which looks like a
chess board.  Let N=256 so x=4.  Messages can move like rooks on this
chess board, so requests can be handled in 2 hops.  So we can caculate the
neihbor count as:
m = 4*(15) = 60

Now let us consider a larger network g=10 d=6 and x=4, so N=4,000,000.
Messages should be routable in 6 hops and for the niehbor count we get:
m = 4*(6*9+1) = 220

When x and g are held constant m is O(log(N)) just like CAN.

For large N the probability a cell is vacant is e-x.  So at x=4 this
chance is about 1.83%.  This should be acceptable if some redundancy is
used.  Not that when a message has serveral dimensions to move along, the
chance that there is no dirrect way becomes very low.

Node Placement

Nodes must be placed randomly into the network to prevent, serveral types
of attacks which involve an adversary selecting the locations of the grid
he would like to control.  In order to do this a node is forced to pay for
it's identity with a large hash cash calculation.  The hash cash then
determines the node's location in the grid.

One space optimization, is to store all the identification in the public
key.  The first time user picks an N=pq.  He then tries to make an e such
that it's first 32 bits contain the julian date, it is relatively prime to
(p-1)(q-1), and SHA-1(<e,N>) has the first h bits zero.  We can then set h
to be so high it takes a 1CPU-day to calculate the key.

We then use some globaly defined salt per dimension Si, which we encript
with the <e,N> to give us the identities location L in the grid.
Li = (Si emod N) mod g

Boot Strapping

Boot strapping is pretty straight forward.  A new node must have contact
with one node in the system.  It uses this node to route a message to the
target cell, a node there sends its IP:Port encripted with the public key
of the identity back.  The node the decrypts this to get the IP out.  It
never has dirrect contact to any of the other nodes along the routing
path.  Once the new node connects to the node in its home cell, it can get
all the other IPs and keys from it.

It could be the case that the cell is vacant, or that one doesn't want to
trust the single node from the home cell.  It is easy for such a boot
strap message to be routed to find a node from a cell connected to the
target cell.  The same protocol can be used.  So one might have to repeat
the process d times to get nodes from all the dimensions.

Note that this type of bootstrapping will probably only have to happen
once for a node.  Once a strip of g*x nodes all know each other, some of
them can leave and join with out having to worry too much about not being
able to find anyone when they come back.

DHT Functionality

Similar to the way in which a node must be kept from deciding what cell it
wants to serve in, we must ensure that the cells to which keys for
requests and inserts are mapped can not be selected by an advesary.  Based
on the application key of which there may be many types a routing key is
calculated based on a small hash cash proportional to the some
requirements of the key like:
* size of the data
* priority of messages
* possiblity and maximum rate of rewrite
* expiration rate of the data
* expiration date of the key
The hash cash which is calculated will then be given back to the
application to use for the a pointer.  The actual routing key is the hash
of the hash cash and other key data that went into the hash cash
calculation.  This ensures that a hard hash cash calculation must be done
before determining what node(s) will be responsible for handling the key.

For example you want to make a pretty picture freesite.  For your picture
you know it's not going to change or at least nobody is going to link
dirrectly to the picture, so you calculate a cheap on time hash cash for
it, which is included in your HTML.  For your HTML you decide you want to
be able to change it once a day so you buy a more expensive hash cash for
it.  This is given to others who want to see or link to your site.

Requests with substandard hash cash don't have to be dropped they can just
be treated with lower priority.

File sharing data should ooz slowly through a network, while browsing
requests should skip over them.

Premix Routing

Tarzan provides a very good model for premix routing.  One problem is that
they rely on IP subnets as a resource to limit adversaries and they
require all nodes to contact all other nodes periodically in order to
verify identities.  The Grid solves this problem.

Since an advesary can only create a certain amount of valid identities and
those identities are spread around evenly, there is no way for an
advesarial node to present a large list of faulty neighbors.  If the
advesary provides the tunnel user with a very small list this would be
obvious too.

If that weren't enough, we could demand certificates from multiple nodes
in a cell to verify neighbor lists.

It should also be noted that the DHT traffic can provide perfect cover
traffic.

Resource Accounting

Since the relationships between nodes are long term and identity isn't
free, barder is a great economic model.  Over the long term two neihbor
nodes can add up how much work they did for each other, and do some simple
statistics to see if anyone is being a "leach".  No centralized authority
is required.

Premix bandwidth is always charged back the dirrection the tunnel
originates from.

Topology Aware Routing

Since routing can go in multiple dimensions and mutliple nodes inhabit the
same cells, there is great flexibility in which way messages can be
routed.  Messages can be forwarded along the path with the lowest latency
and to nodes with low load.  This is similare to how you navigate American
streets with a car avoiding the streets and avenues with too much traffic.

Attack Models

Cancer Node Censorship Attack

In this model an adversary runs a few nodes where particular requests are
likely to be handled in order to drop or mishandle them.

The Grid prevents this by randomly distributing expensive identities
throughout.  In that large 4,000,000 node network only 1 in a million
identities would be in the target cell.  If each identity cost 1 CPU-day,
an adversary would have to buy lots of hardware and work really long.

Spy and Flood

There is a slightly better chance of getting a node to wind up connected
to the target cell: 54/1,000,000.  From here the advesary could lauch
classic SYN floods or try to hack nodes in the target cell, since he would
know thier IPs.

This still is a prohibitive amount of  work. A little bit of redundacy
will also dramatically increase resistance further.

DHT Flood

In this attack model the advesary tries to generate messages, which all
must be handeled by nodes in the target area.  The hash cash should
prevent this from happening for most types of keys.  Since you have to pay
first and then you know the route.  Messages for things with more
expensive hash cash could be processed first, to avoid shortcoming of
cheaper key types.

boconstrictor attack --naa

Other Wacky Ideas

Clusters
Big Clusters
Using levels to deal with heterogenity
Baby Mode - Slower bootstrap
setstats 1

-------------------------------------------------------
This SF.net email is sponsored by: SF.net Giveback Program.
help you create better code?  SHARE THE LOVE, and help us help
_______________________________________________
mnet-devel mailing list
mnet-devel at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mnet-devel

----- End forwarded message -----
-- Eugen* Leitl <a href="http://leitl.org">leitl</a>
______________________________________________________________
ICBM: 48.07078, 11.61144            http://www.leitl.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE
http://moleculardevices.org         http://nanomachines.net

[demime 0.97c removed an attachment of type application/pgp-signature]

```