Tor, the pentagon's cyberweapon

Karl gmkarl at gmail.com
Tue Oct 13 13:03:19 PDT 2020


Looking Inside a Cyberweapon

Let's continue our exploration of the insides of tor, since if we're
spamming a coder list we might as well do it with code.  If you can
crack this cyberweapon, maybe you can turn it against invading aliens
like me.

```
 src]$ less app/main/main.c
```

Turns out the next file is the same as the previous file, just a
different spot.  In `less`, you can hit `/` and then type some words
and hit enter, to move forward to where those words are in the file.
Hit `n` to go to the next occurrence.  To go up to the top of the file
to cycle again, hit `g`.  So:

`g`
`/run_tor_main_loop` <enter>

And there it is.  The main loop of tor.

```
int
run_tor_main_loop(void)
{
  handle_signals();
  timers_initialize();
  initialize_mainloop_events();
```
No comments here.  Different.  Three more initialization calls, but
they sound like smaller concepts being initialized than before.

```
  /* load the private keys, if we're supposed to have them, and set up the
   * TLS context. */
  if (! client_identity_key_is_set()) {
    if (init_keys() < 0) {
      log_err(LD_OR, "Error initializing keys; exiting");
      return -1;
    }
  }
```
Okay, it seems like tor has a client identity key, which makes a ton
of sense.  These calls to `client_identity_key_is_set` and
`init_keys` imply that tor stores its state in static global data.
This means a single tor process can't run two tor nodes.  It's easy to
change but involves slightly altering words throughout the entire
source.  Meanwhile, functions are quicker to type this way, because
you don't need to tell them which node they're working with.
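
Here's a tiny sketch of the two styles, to show the tradeoff.  None of
these names come from tor; `global_key_init`, `struct node` and
friends are made up for illustration, and the "key" is just a string.

```c
#include <stdbool.h>
#include <stddef.h>

/* Style 1: one process-wide variable.  Hypothetical names, not tor's. */
static const char *global_identity_key = NULL;

bool global_key_is_set(void) { return global_identity_key != NULL; }
void global_key_init(const char *key) { global_identity_key = key; }

/* Style 2: the state lives in a struct, so one process could hold
 * several nodes, but every function needs to be told which one. */
struct node {
  const char *identity_key;
};

bool node_key_is_set(const struct node *n) { return n->identity_key != NULL; }
void node_key_init(struct node *n, const char *key) { n->identity_key = key; }
```

With style 2 you could declare two `struct node` values and set a key
on only one of them; with style 1 there is only ever the one key.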

I'm now wondering where and how the running tor state is stored, and I
could maybe visit `client_identity_key_is_set` or `init_keys` to start
learning that.

```
  /* Set up our buckets */
  connection_bucket_init();
```

The word 'bucket' is often used to represent a chunk of objects
(objects are basically computer thoughts), grouped together to find
and store them easily, often in a list.
There's a certain kind of list called an array or a vector, that is
stored as a big long line of data, all in a row, inside the computer.
So when people with beards talk about arrays or vectors, they mean
that the list is organized in an ordered line, inside the computer.
When you just say 'list' it doesn't necessarily mean this.  I imagine
buckets as being part of an array or vector, but I don't really know.
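
To make the array-versus-list difference concrete, here is a small
sketch in C.  It says nothing about what tor's buckets actually are;
`array_sum`, `list_sum` and `struct list_item` are made-up names.

```c
#include <stddef.h>

/* An array: the items sit side by side in memory, in order. */
int array_sum(const int *items, size_t count) {
  int total = 0;
  for (size_t i = 0; i < count; i++)
    total += items[i];  /* items[i] is i slots past the start */
  return total;
}

/* A linked list: each item points to wherever the next one lives. */
struct list_item {
  int value;
  struct list_item *next;
};

int list_sum(const struct list_item *head) {
  int total = 0;
  for (const struct list_item *it = head; it != NULL; it = it->next)
    total += it->value;
  return total;
}
```

Both hold "a list of numbers", but only the array guarantees the
numbers are in one ordered line inside the computer.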

```
  /* initialize the bootstrap status events to know we're starting up */
  control_event_bootstrap(BOOTSTRAP_STATUS_STARTING, 0);
```
Maybe this is referencing the control protocol of Tor, a server it
runs that other programs can connect to in order to control it, if
you authorize them.

```
  /* Initialize the keypinning log. */
  if (authdir_mode_v3(get_options())) {
    char *fname = get_datadir_fname("key-pinning-journal");
    int r = 0;
    if (keypin_load_journal(fname)<0) {
      log_err(LD_DIR, "Error loading key-pinning journal: %s",strerror(errno));
      r = -1;
    }
    if (keypin_open_journal(fname)<0) {
      log_err(LD_DIR, "Error opening key-pinning journal: %s",strerror(errno));
      r = -1;
    }
    tor_free(fname);
    if (r)
      return r;
  }
  {
    /* This is the old name for key-pinning-journal.  These got corrupted
     * in a couple of cases by #16530, so we started over. See #16580 for
     * the rationale and for other options we didn't take.  We can remove
     * this code once all the authorities that ran 0.2.7.1-alpha-dev are
     * upgraded.
     */
    char *fname = get_datadir_fname("key-pinning-entries");
    unlink(fname);
    tor_free(fname);
  }
```
Key-pinning often means a way to store what the identities of things
are, so as to reject them if they change.  Don't know for sure.
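
As a rough sketch of that general idea (not of tor's keypin code, which
I haven't read yet): remember the first key seen for a name, and
complain if the same name later shows up with a different key.
`struct pin_store` and `pin_check_and_add` are invented for this
example.

```c
#include <string.h>

#define MAX_PINS 16

/* One remembered (name, key) pair.  Fixed-size buffers keep this short. */
struct pin {
  char name[32];
  char key[64];
};

struct pin_store {
  struct pin pins[MAX_PINS];
  int count;
};

/* Returns 1 if the pair is newly pinned, 0 if it matches an existing
 * pin, and -1 if the name is pinned to a different key, or if the
 * store is full or the strings don't fit. */
int pin_check_and_add(struct pin_store *store, const char *name,
                      const char *key) {
  for (int i = 0; i < store->count; i++) {
    if (strcmp(store->pins[i].name, name) == 0)
      return strcmp(store->pins[i].key, key) == 0 ? 0 : -1;
  }
  if (store->count == MAX_PINS ||
      strlen(name) >= sizeof(store->pins[0].name) ||
      strlen(key) >= sizeof(store->pins[0].key))
    return -1;
  strcpy(store->pins[store->count].name, name);
  strcpy(store->pins[store->count].key, key);
  store->count++;
  return 1;
}
```

A journal file, like the one loaded above, would presumably be how
pins like these survive a restart.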

```
  if (trusted_dirs_reload_certs()) {
    log_warn(LD_DIR,
             "Couldn't load all cached v3 certificates. Starting anyway.");
  }
```
Ouch!  Nobody will ever see this warning.  It likely relates to
knowing you are connecting to the nodes that you think you are.  A
failure here looks unlikely.

```
  if (router_reload_consensus_networkstatus()) {
    return -1;
  }
  /* load the routers file, or assign the defaults. */
  if (router_reload_router_list()) {
    return -1;
  }
```
I wonder if it's connecting to nodes on the internet yet.  I'm curious
what `router_reload_consensus_networkstatus` does.

```
  /* load the networkstatuses. (This launches a download for new routers as
   * appropriate.)
   */
  const time_t now = time(NULL);
  directory_info_has_arrived(now, 1, 0);
```
This sounds pretty important.  Better check what
`directory_info_has_arrived` does.

```
  if (server_mode(get_options()) || dir_server_mode(get_options())) {
    /* launch cpuworkers. Need to do this *after* we've read the onion key. */
    cpu_init();
  }
  consdiffmgr_enable_background_compression();

  /* Setup shared random protocol subsystem. */
  if (authdir_mode_v3(get_options())) {
    if (sr_init(1) < 0) {
      return -1;
    }
  }
```
We can confirm with "launch cpuworkers" and `background_compression`
that it's using multiple threads now.  It has apparently already
loaded an 'onion key' by this point.

```
  /* initialize dns resolve map, spawn workers if needed */
  if (dns_init() < 0) {
    if (get_options()->ServerDNSAllowBrokenConfig)
      log_warn(LD_GENERAL, "Couldn't set up any working nameservers. "
               "Network not up yet?  Will try again soon.");
    else {
      log_err(LD_GENERAL,"Error initializing dns subsystem; exiting.  To "
              "retry instead, set the ServerDNSAllowBrokenResolvConf option.");
    }
  }
```
This doesn't look that interesting; I think tor keeps an internal
mapping of dns names.

```
#ifdef HAVE_SYSTEMD
  {
    const int r = sd_notify(0, "READY=1");
    if (r < 0) {
      log_warn(LD_GENERAL, "Unable to send readiness to systemd: %s",
               strerror(r));
    } else if (r > 0) {
      log_notice(LD_GENERAL, "Signaled readiness to systemd");
    } else {
      log_info(LD_GENERAL, "Systemd NOTIFY_SOCKET not present.");
    }
  }
#endif /* defined(HAVE_SYSTEMD) */
```
I'm thinking I can totally ignore this since many valid systems don't
have systemd.
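
The reason it can be ignored is the `#ifdef`: if the macro isn't
defined at build time, the compiler never sees the code inside.  A
minimal sketch, where `HAVE_FANCY_INIT` and `notify_ready` are made-up
stand-ins:

```c
#include <stdio.h>

/* HAVE_FANCY_INIT is a hypothetical macro standing in for HAVE_SYSTEMD. */
int notify_ready(void) {
#ifdef HAVE_FANCY_INIT
  puts("told the init system we are ready");
  return 1;
#else
  /* Without the macro, the branch above is never even compiled. */
  return 0;
#endif
}
```

Building with `-DHAVE_FANCY_INIT` on the compiler command line would
switch the first branch on.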

```
  return do_main_loop();
}
```
Here we get handed off to yet another main loop.  Familiar!
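
The overall shape of the function we just read is: run setup steps in
order, bail out with -1 if one fails, and only then hand control to
the loop.  A stripped-down sketch of that shape, with made-up step
names and a fake loop that just counts rounds:

```c
#include <stdio.h>

/* Hypothetical setup steps; each returns 0 on success, -1 on failure. */
int load_keys(int ok) { return ok ? 0 : -1; }
int load_directory(int ok) { return ok ? 0 : -1; }

/* Stand-in for the real loop: runs a fixed number of rounds, returns 0. */
int event_loop(int rounds) {
  for (int i = 0; i < rounds; i++)
    printf("handling events, round %d\n", i);
  return 0;
}

int run_main(int keys_ok, int dir_ok) {
  if (load_keys(keys_ok) < 0) {
    fprintf(stderr, "Error initializing keys; exiting\n");
    return -1;
  }
  if (load_directory(dir_ok) < 0)
    return -1;
  /* Everything is set up, so whatever the loop returns is our result. */
  return event_loop(2);
}
```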

```
 src]$ grep -r do_main_loop\( .
./app/main/main.c:  return do_main_loop();
./core/mainloop/mainloop.c:do_main_loop(void)
```

```
$ grep -r directory_info_has_arrived\( .
./app/main/main.c:  directory_info_has_arrived(now, 1, 0);
./core/mainloop/mainloop.c:directory_info_has_arrived(time_t now, int from_cache, int suppress_logs)
```

Looks like `core/mainloop/mainloop.c` is the next place to go.

