Example: A routes packet P, via nodes B then C, to node D. Nodes: A B C D PK means "public key", each node knows the PK of the other nodes it communciates with or routes to. Using PK is shorthand - actual encrypted link usually uses PK crypto to negotiate a symmetric session key, which might be changed over time and other complexities, but for the purpose of comprehending basic implementation steps and understanding issues arising, it's sufficient to think in simpler terms of just per node public keys. "Physical" hops in the following example routes: AB BC CD "Virtual" routes we can consider in examples: AB ABC ABCD and also BC, BD, and CD Each hop, e.g. AB, needs their comms to be encrypted to protect from eves droppers. So when A sends P to B, P is encrypted with B's PK. We can use a notation like B(P) to mean encrypt P with B's PK. Similarly notation B'(B(P)) to mean decrypt B(P) to get the raw/unencrypted packet P, from the encrypted version B(P). 1) A sends P only to B 1.1 A encrypts P to B, i.e. sending B(P) to B 1.2 B decrypts, doing B'(B(P)) to get P 2) A sends P to C, via B A wants B to not read P, only to forward C(P) on to C; So, naievely, A sends C(P) to B, with a request to forward C(P) onwards to C. This is naieve, since two problems arise: i) A's request to B is not encrypted, only P is encrypted as C(P). ii) A sends the encrypted packet C(P) first to B (with the routing instruction), and then B also sends C(P) (without any additional routing instructions), to C, but the same set of encrypted bytes has been sent first via AB, and second via BC; the obvious problem with this is that an onlooker (GPA), can simply see that "something" (encrypted) has been sent from A to C, which defeats the purpose of A sending via B at all - there is no benefit over just sending C(P) directly to C. This particular sub-problem can be solved in a couple of immediate ways: I) B receives C(P) from A, so instead of sending that to C, it sends C(C(P)) to C, which C can decrypt, but C needs to know, or guess how many layers to unwrap, and this puts extra unnecessary load on C - the more hops, the more load on the end point of the route, aka not good. II) A first encrypts P using C's PK, then does similarly for node B, so sending B(C(P)) to B, which B decrypts to get C(P), which B can on-send to node C with no extra encryption effort! This puts the load on A, to encrypt its packets according to the total route desired. This is logical, as A wants to minimize trust in any downstream nodes, so doing the work yourself pays off with certainty and less need for trust, greater privacy, and while A is at it, it can embed the required routing header for each hop, into the appropriately layer, thus encrypting all routing requests. A routing header can be denoted with the letter "r", followed if needed by the node letter for the target node of the route request, e.g. "rC" means "route this packet to node C". So for A to 'properly' send P to node C: 2.1 A sends B(rC+C(P)) to node B 2.2 B decrypts, getting rC + C(P) 2.3 B sends C(P) to node C An obvious information exposure bug in steps 2.1 and 2.3 is that the unless we somehow compensate for this, the packet B(rC+C(P)), is larger than the packet C(P) - so our GPA snoop can see that A's packet sent to B, is larger than B's packet sent to C, and it's not hard to guess that the difference is gonna be routing information. The simple solution to different packet sizes is to rely on B to pad C(P) with some random bytes (which need to be ignored by C), so that it's the same size as B(rC+C(P)). But this means once again, relying on B to do the right thing here. Can we get around this reliance on B? The problem only gets worse as we go to an example that includes the next hop, CD... Separating routing requests into separate packets ("control" packets or whatever we might call them) would have downsides of extra buffering and latency to handle dropped packets, packet sequence problems (we're talking UDP here), lower-level packet routing timing issues, requests to resend packets that don't arrive after a timeout etc etc. This is way too complex (reimplementing TCP basically), so forget that - routing requests must be embedded, attached or otherwise be very closely associated, with the packets being routed. Whether routing info is a "tag", a "byte", an IP (v4 or v6) address or etc doesn't matter - a smaller header doesn't make the problem go away, AFAICS.