Looking Inside a Cyberweapon

Let's continue our exploration of the insides of tor, since if we're spamming a coder list we might as well do it with code. If you can crack this cyberweapon, maybe you can turn it against invading aliens like me.

```
src]$ less app/main/main.c
```

Turns out the next file is the same as the previous file, just a different spot. In `less`, you can hit `/`, type some words, and hit enter to move forward to where those words are in the file. Hit `n` to go to the next occurrence. To go back up to the top of the file and cycle again, hit `g`.

So: `g` `/run_tor_main_loop` <enter>

And there it is. The main loop of tor.

```
int
run_tor_main_loop(void)
{
  handle_signals();
  timers_initialize();
  initialize_mainloop_events();
```

No comments here. Different. Three more initialization calls, but they sound like smaller concepts than the ones initialized before.

```
  /* load the private keys, if we're supposed to have them, and set up the
   * TLS context. */
  if (! client_identity_key_is_set()) {
    if (init_keys() < 0) {
      log_err(LD_OR, "Error initializing keys; exiting");
      return -1;
    }
  }
```

Okay, it seems like tor has a client identity key, which makes a ton of sense. These calls to `client_identity_key_is_set` and `init_keys` imply that tor stores its state in static global data. This means a single tor process can't run two tor nodes. That's easy to change, but it involves slightly altering words throughout the entire source. Meanwhile, functions are quicker to type this way, because you don't need to tell them which node they're working with. I'm now wondering where and how the running tor state is stored, and I could maybe visit `client_identity_key_is_set` or `init_keys` to start learning that.

```
  /* Set up our buckets */
  connection_bucket_init();
```

The word 'bucket' is often used to represent a chunk of objects (objects are basically computer thoughts), grouped together to find and store them easily, often in a list.
There's a certain kind of list called an array or a vector, which is stored as one big long line of data, all in a row, inside the computer. So when people with beards talk about arrays or vectors, they mean that the list is organized in an ordered line inside the computer. When you just say 'list', it doesn't necessarily mean this. I imagine buckets as being part of an array or vector, but I don't really know.

```
  /* initialize the bootstrap status events to know we're starting up */
  control_event_bootstrap(BOOTSTRAP_STATUS_STARTING, 0);
```

Maybe this refers to Tor's control protocol: a server it runs that other programs can connect to in order to control it, if you authorize them.

```
  /* Initialize the keypinning log. */
  if (authdir_mode_v3(get_options())) {
    char *fname = get_datadir_fname("key-pinning-journal");
    int r = 0;
    if (keypin_load_journal(fname)<0) {
      log_err(LD_DIR, "Error loading key-pinning journal: %s",strerror(errno));
      r = -1;
    }
    if (keypin_open_journal(fname)<0) {
      log_err(LD_DIR, "Error opening key-pinning journal: %s",strerror(errno));
      r = -1;
    }
    tor_free(fname);
    if (r)
      return r;
  }
  {
    /* This is the old name for key-pinning-journal. These got corrupted
     * in a couple of cases by #16530, so we started over. See #16580 for
     * the rationale and for other options we didn't take. We can remove
     * this code once all the authorities that ran 0.2.7.1-alpha-dev are
     * upgraded. */
    char *fname = get_datadir_fname("key-pinning-entries");
    unlink(fname);
    tor_free(fname);
  }
```

Key-pinning usually means recording which key belongs to which identity, so that a later change can be rejected. I don't know for sure that's what this is.

```
  if (trusted_dirs_reload_certs()) {
    log_warn(LD_DIR,
             "Couldn't load all cached v3 certificates. Starting anyway.");
  }
```

Ouch! Nobody will ever see this warning. It likely relates to knowing you are connecting to the nodes that you think you are. The warning looks unlikely to trigger, though.
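
To make the key-pinning guess above a bit more concrete, here is a minimal sketch of the general idea: remember the first key seen for each identity and refuse a later mismatch. Everything in it (`struct pin`, `pin_check_and_add`, the fixed-size table, the string keys) is invented for illustration and is not taken from tor's keypin code, which I haven't read yet.

```c
#include <stdio.h>
#include <string.h>

#define MAX_PINS 16

/* Hypothetical pin table: each entry remembers which key an identity
 * was first seen with. None of these names come from tor. */
struct pin {
  char identity[32];
  char key[32];
};

static struct pin pins[MAX_PINS];
static int n_pins = 0;

/* Return 1 if the pair is newly pinned, 0 if it matches an existing pin,
 * and -1 if the identity shows up with a different key or the table
 * is full. */
static int
pin_check_and_add(const char *identity, const char *key)
{
  for (int i = 0; i < n_pins; i++) {
    if (strcmp(pins[i].identity, identity) == 0)
      return strcmp(pins[i].key, key) == 0 ? 0 : -1;
  }
  if (n_pins == MAX_PINS)
    return -1;
  /* snprintf truncates over-long strings instead of overflowing. */
  snprintf(pins[n_pins].identity, sizeof(pins[n_pins].identity),
           "%s", identity);
  snprintf(pins[n_pins].key, sizeof(pins[n_pins].key), "%s", key);
  n_pins++;
  return 1;
}
```

Calling `pin_check_and_add("relay-a", "key-1")` twice and then `pin_check_and_add("relay-a", "key-2")` pins the pair, accepts the repeat, and rejects the changed key. A journal on disk, like the one the code above loads and opens, would presumably be how such pins survive a restart.
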

```
  if (router_reload_consensus_networkstatus()) {
    return -1;
  }
  /* load the routers file, or assign the defaults. */
  if (router_reload_router_list()) {
    return -1;
  }
```

I wonder if it's connecting to nodes on the internet yet. I'm curious what `router_reload_consensus_networkstatus` does.

```
  /* load the networkstatuses. (This launches a download for new routers as
   * appropriate.)
   */
  const time_t now = time(NULL);
  directory_info_has_arrived(now, 1, 0);
```

This sounds pretty important. Better check what `directory_info_has_arrived` does.

```
  if (server_mode(get_options()) || dir_server_mode(get_options())) {
    /* launch cpuworkers. Need to do this *after* we've read the onion key. */
    cpu_init();
  }
  consdiffmgr_enable_background_compression();

  /* Setup shared random protocol subsystem. */
  if (authdir_mode_v3(get_options())) {
    if (sr_init(1) < 0) {
      return -1;
    }
  }
```

The "launch cpuworkers" comment and `background_compression` suggest that it's using multiple threads by now. Apparently it has also already loaded an 'onion key' by this point.

```
  /* initialize dns resolve map, spawn workers if needed */
  if (dns_init() < 0) {
    if (get_options()->ServerDNSAllowBrokenConfig)
      log_warn(LD_GENERAL, "Couldn't set up any working nameservers. "
               "Network not up yet? Will try again soon.");
    else {
      log_err(LD_GENERAL,"Error initializing dns subsystem; exiting. To "
              "retry instead, set the ServerDNSAllowBrokenResolvConf option.");
    }
  }
```

This doesn't look that interesting; I think tor keeps an internal mapping of DNS names.

```
#ifdef HAVE_SYSTEMD
  {
    const int r = sd_notify(0, "READY=1");
    if (r < 0) {
      log_warn(LD_GENERAL, "Unable to send readiness to systemd: %s",
               strerror(r));
    } else if (r > 0) {
      log_notice(LD_GENERAL, "Signaled readiness to systemd");
    } else {
      log_info(LD_GENERAL, "Systemd NOTIFY_SOCKET not present.");
    }
  }
#endif /* defined(HAVE_SYSTEMD) */
```

I'm thinking I can totally ignore this, since many valid systems don't have systemd.
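
For anyone who doesn't want to ignore it: the three branches line up with the return convention documented for `sd_notify`, which is positive when the message went out, zero when `NOTIFY_SOCKET` is unset, and negative on error. Below is a rough sketch of that readiness ping, assuming Linux. The function name `notify_ready_sketch` is made up, and this is a stand-in for illustration, not the libsystemd implementation.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical stand-in for sd_notify(0, "READY=1").
 * Returns 1 if the datagram was sent, 0 if NOTIFY_SOCKET is unset,
 * and a negative errno-style value on failure. */
static int
notify_ready_sketch(void)
{
  const char *path = getenv("NOTIFY_SOCKET");
  if (!path || !*path)
    return 0;

  struct sockaddr_un addr;
  size_t len = strlen(path);
  if (len >= sizeof(addr.sun_path))
    return -ENAMETOOLONG;

  memset(&addr, 0, sizeof(addr));
  addr.sun_family = AF_UNIX;
  memcpy(addr.sun_path, path, len);

  /* A leading '@' names a Linux abstract socket, whose address starts
   * with a NUL byte and has no trailing terminator. */
  int abstract = (path[0] == '@');
  if (abstract)
    addr.sun_path[0] = '\0';
  socklen_t addrlen = (socklen_t)(offsetof(struct sockaddr_un, sun_path) +
                                  len + (abstract ? 0 : 1));

  int fd = socket(AF_UNIX, SOCK_DGRAM, 0);
  if (fd < 0)
    return -errno;

  static const char msg[] = "READY=1";
  ssize_t sent = sendto(fd, msg, sizeof(msg) - 1, 0,
                        (const struct sockaddr *)&addr, addrlen);
  int result = (sent < 0) ? -errno : 1; /* read errno before close() */
  close(fd);
  return result;
}
```

One thing the sketch makes visible: with a negative errno-style failure value, a caller would want to negate it before handing it to `strerror`.
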

```
  return do_main_loop();
}
```

Here we get handed off to yet another main loop. Familiar!

```
src]$ grep -r do_main_loop\( .
./app/main/main.c:  return do_main_loop();
./core/mainloop/mainloop.c:do_main_loop(void)
```

```
$ grep -r directory_info_has_arrived\( .
./app/main/main.c:  directory_info_has_arrived(now, 1, 0);
./core/mainloop/mainloop.c:directory_info_has_arrived(time_t now, int from_cache, int suppress_logs)
```

Looks like `core/mainloop/mainloop.c` is the next place to go.
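
Before moving on, the static-global-state observation from the key-loading step can be made concrete. This is a small sketch contrasting the two styles. All names in it (`global_identity_key`, `struct node_state`, and the four functions) are hypothetical and chosen for illustration; none of them are tor's.

```c
#include <stdbool.h>
#include <stddef.h>

/* Style 1: one process-wide identity key. Functions take no "which node"
 * argument, so they are short to call, but there can be only one node. */
static const char *global_identity_key = NULL;

static bool
global_key_is_set(void)
{
  return global_identity_key != NULL;
}

static void
global_key_set(const char *key)
{
  global_identity_key = key;
}

/* Style 2: the same state carried in a struct. Every function gains a
 * parameter, but one process can hold several independent nodes. */
struct node_state {
  const char *identity_key;
};

static bool
node_key_is_set(const struct node_state *node)
{
  return node->identity_key != NULL;
}

static void
node_key_set(struct node_state *node, const char *key)
{
  node->identity_key = key;
}
```

With the second style, two `struct node_state` values can sit side by side, one with a key and one without. Moving from the first style to the second is the "slightly altering words throughout the entire source" kind of change: every call site has to start passing the node along.
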