[mnet-devel] Grid Of Trust -- pre-design

Tue Nov 25 06:37:31 PST 2003

 Some Guy wrote:
>
> Hey guys who ever wants to take first peek at my baby DHT/premix idea please
take a peek and
> crtitique it.  It's a bit rough around the edges and there's plenty of room
for modification, but
> I'm pretty sure the ideas are sound.
>
> http://de.geocities.com/amichrisde/Grid_Of_Trust.html
>
> The basic idea is to force data and nodes to be randomly distributed in a
grid.  This provides
> resistance against serveral types of attacks.
>
> Feel free to comment on or off list.

Here is the text of that web page, for the sake of future generations who are
required to read the mnet-devel archives as part of their education:

                                 Grid Of Trust

   This isn't a design document.  It's more of a proposal for what could be
   to make sure all the attacks are really covered.

Topology

   Nodes are distributed randomly into cells of a hypercube with d
   dimensionals.  Each axis is broken down into g pieces such that the
   hypercube contains gd cells.

   Nodes in a cell can connect to all nodes in cells whose coordinates vary
   in at most one dimension.  This is different than CAN where only dirrect
   neighbors have links.  A DHT based on this type of a network should only
   require d hops to handle a request.  The cost is of coarse a higher
   neihbor count.

   We will define x as the expected population or a cell and N as the total
   number of nodes.
   x = N/(gd)

   We will define m as the expected neighbor count.
   m = x(d*(g-1)+1)

   So let us visualize a small network with g=8 and d=2 which looks like a
   chess board.  Let N=256 so x=4.  Messages can move like rooks on this
   chess board, so requests can be handled in 2 hops.  So we can caculate the
   neihbor count as:
   m = 4*(15) = 60

   --todo add pretty picture

   Now let us consider a larger network g=10 d=6 and x=4, so N=4,000,000.
   Messages should be routable in 6 hops and for the niehbor count we get:
   m = 4*(6*9+1) = 220

   This still would be acceptable for freenet like applications.

   When x and g are held constant m is O(log(N)) just like CAN.

   For large N the probability a cell is vacant is e-x.  So at x=4 this
   chance is about 1.83%.  This should be acceptable if some redundancy is
   used.  Not that when a message has serveral dimensions to move along, the
   chance that there is no dirrect way becomes very low.

Node Placement

   Nodes must be placed randomly into the network to prevent, serveral types
   of attacks which involve an adversary selecting the locations of the grid
   he would like to control.  In order to do this a node is forced to pay for
   it's identity with a large hash cash calculation.  The hash cash then
   determines the node's location in the grid.

   One space optimization, is to store all the identification in the public
   key.  The first time user picks an N=pq.  He then tries to make an e such
   that it's first 32 bits contain the julian date, it is relatively prime to
   (p-1)(q-1), and SHA-1(<e,N>) has the first h bits zero.  We can then set h
   to be so high it takes a 1CPU-day to calculate the key.

   We then use some globaly defined salt per dimension Si, which we encript
   with the <e,N> to give us the identities location L in the grid.
   Li = (Si emod N) mod g

Boot Strapping

   Boot strapping is pretty straight forward.  A new node must have contact
   with one node in the system.  It uses this node to route a message to the
   target cell, a node there sends its IP:Port encripted with the public key
   of the identity back.  The node the decrypts this to get the IP out.  It
   never has dirrect contact to any of the other nodes along the routing
   path.  Once the new node connects to the node in its home cell, it can get
   all the other IPs and keys from it.

   It could be the case that the cell is vacant, or that one doesn't want to
   trust the single node from the home cell.  It is easy for such a boot
   strap message to be routed to find a node from a cell connected to the
   target cell.  The same protocol can be used.  So one might have to repeat
   the process d times to get nodes from all the dimensions.

   Note that this type of bootstrapping will probably only have to happen
   once for a node.  Once a strip of g*x nodes all know each other, some of
   them can leave and join with out having to worry too much about not being
   able to find anyone when they come back.

DHT Functionality

   Similar to the way in which a node must be kept from deciding what cell it
   wants to serve in, we must ensure that the cells to which keys for
   requests and inserts are mapped can not be selected by an advesary.  Based
   on the application key of which there may be many types a routing key is
   calculated based on a small hash cash proportional to the some
   requirements of the key like:
     * size of the data
     * priority of messages
     * possiblity and maximum rate of rewrite
     * expiration rate of the data
     * expiration date of the key
   The hash cash which is calculated will then be given back to the
   application to use for the a pointer.  The actual routing key is the hash
   of the hash cash and other key data that went into the hash cash
   calculation.  This ensures that a hard hash cash calculation must be done
   before determining what node(s) will be responsible for handling the key.

   For example you want to make a pretty picture freesite.  For your picture
   you know it's not going to change or at least nobody is going to link
   dirrectly to the picture, so you calculate a cheap on time hash cash for
   it, which is included in your HTML.  For your HTML you decide you want to
   be able to change it once a day so you buy a more expensive hash cash for
   it.  This is given to others who want to see or link to your site.

   Requests with substandard hash cash don't have to be dropped they can just
   be treated with lower priority.

   File sharing data should ooz slowly through a network, while browsing
   requests should skip over them.

Premix Routing

   Tarzan provides a very good model for premix routing.  One problem is that
   they rely on IP subnets as a resource to limit adversaries and they
   require all nodes to contact all other nodes periodically in order to
   verify identities.  The Grid solves this problem.

   Since an advesary can only create a certain amount of valid identities and
   those identities are spread around evenly, there is no way for an
   advesarial node to present a large list of faulty neighbors.  If the
   advesary provides the tunnel user with a very small list this would be
   obvious too.

   If that weren't enough, we could demand certificates from multiple nodes
   in a cell to verify neighbor lists.

   It should also be noted that the DHT traffic can provide perfect cover
   traffic.

Resource Accounting

   Since the relationships between nodes are long term and identity isn't
   free, barder is a great economic model.  Over the long term two neihbor
   nodes can add up how much work they did for each other, and do some simple
   statistics to see if anyone is being a "leach".  No centralized authority
   is required.

   Premix bandwidth is always charged back the dirrection the tunnel
   originates from.

Topology Aware Routing

   Since routing can go in multiple dimensions and mutliple nodes inhabit the
   same cells, there is great flexibility in which way messages can be
   routed.  Messages can be forwarded along the path with the lowest latency
   and to nodes with low load.  This is similare to how you navigate American
   streets with a car avoiding the streets and avenues with too much traffic.

Attack Models

  Cancer Node Censorship Attack

   In this model an adversary runs a few nodes where particular requests are
   likely to be handled in order to drop or mishandle them.

   The Grid prevents this by randomly distributing expensive identities
   throughout.  In that large 4,000,000 node network only 1 in a million
   identities would be in the target cell.  If each identity cost 1 CPU-day,
   an adversary would have to buy lots of hardware and work really long.

  Spy and Flood

   There is a slightly better chance of getting a node to wind up connected
   to the target cell: 54/1,000,000.  From here the advesary could lauch
   classic SYN floods or try to hack nodes in the target cell, since he would
   know thier IPs.

   This still is a prohibitive amount of  work. A little bit of redundacy
   will also dramatically increase resistance further.

  DHT Flood

   In this attack model the advesary tries to generate messages, which all
   must be handeled by nodes in the target area.  The hash cash should
   prevent this from happening for most types of keys.  Since you have to pay
   first and then you know the route.  Messages for things with more
   expensive hash cash could be processed first, to avoid shortcoming of
   cheaper key types.

   boconstrictor attack --naa

Other Wacky Ideas

   Clusters
   Big Clusters
   Using levels to deal with heterogenity
   Baby Mode - Slower bootstrap
   setstats 1

-------------------------------------------------------
This SF.net email is sponsored by: SF.net Giveback Program.
Does SourceForge.net help you be more productive?  Does it
help you create better code?  SHARE THE LOVE, and help us help
YOU!  Click Here: http://sourceforge.net/donate/
_______________________________________________
mnet-devel mailing list
mnet-devel at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mnet-devel

----- End forwarded message -----
-- Eugen* Leitl <a href="http://leitl.org">leitl</a>
______________________________________________________________
ICBM: 48.07078, 11.61144            http://www.leitl.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A  7779 75B0 2443 8B29 F6BE
http://moleculardevices.org         http://nanomachines.net

[demime 0.97c removed an attachment of type application/pgp-signature]