[spam][crazy][log] gdb, thread local, libgit, instruction pointer

Undiscussed Groomed for Male Slavery, One Victim of Many gmkarl+brainwashingandfuckingupthehackerslaves at gmail.com
Sun Aug 7 08:43:35 PDT 2022


how libgit2 revwalk works:
- when initialised with a glob, it iterates over all the refs that
match the glob and forwards each one to the procedure that initialises
with refs
- initialising with a ref converts the ref to a name, then to a direct
oid, and hands off to the procedure that initialises with commits
- from a quick look, the commit initialisation code likely retains a
set of commit objects and inserts each one into the set, such that
adding a commit twice would be a no-op. not certain. the oidmap is
stored in walk->commits.
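
a minimal sketch of the glob -> ref -> commit hand-off described above, written as a self-contained toy in C rather than copied from libgit2. every toy_* name and struct is made up for illustration. the set is a plain array here, where libgit2 keeps an oidmap in walk->commits, and POSIX fnmatch() stands in for whatever glob matching libgit2 does internally. the corresponding public libgit2 calls are git_revwalk_push_glob, git_revwalk_push_ref and git_revwalk_push.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>
#include <string.h>

#define MAX_COMMITS 16

/* Hypothetical stand-in for a reference: a name plus the id it points at. */
struct toy_ref {
    const char *name;
    const char *oid;
};

/* Hypothetical walk state: a small set of commit ids (cf. walk->commits). */
struct toy_walk {
    const char *commits[MAX_COMMITS];
    size_t count;
};

/* Push one commit. Returns 1 if newly inserted, 0 if already present,
 * so pushing the same commit twice is a no-op. */
static int toy_push_commit(struct toy_walk *w, const char *oid)
{
    for (size_t i = 0; i < w->count; i++)
        if (strcmp(w->commits[i], oid) == 0)
            return 0;
    assert(w->count < MAX_COMMITS);
    w->commits[w->count++] = oid;
    return 1;
}

/* Push by ref name: resolve the name to an oid, then hand off to the
 * commit-level push. Returns -1 if no ref has that name. */
static int toy_push_ref(struct toy_walk *w, const struct toy_ref *refs,
                        size_t nrefs, const char *name)
{
    for (size_t i = 0; i < nrefs; i++)
        if (strcmp(refs[i].name, name) == 0)
            return toy_push_commit(w, refs[i].oid);
    return -1;
}

/* Push by glob: visit every ref and forward the matching ones to the
 * ref-level push. Returns the number of refs that matched. */
static int toy_push_glob(struct toy_walk *w, const struct toy_ref *refs,
                         size_t nrefs, const char *glob)
{
    int matched = 0;
    for (size_t i = 0; i < nrefs; i++) {
        if (fnmatch(glob, refs[i].name, 0) == 0) {
            toy_push_ref(w, refs, nrefs, refs[i].name);
            matched++;
        }
    }
    return matched;
}
```

note that the glob path touches every ref even when only a few match, which is where the cost comes from. pushing by ref name or by commit id skips that scan.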

(gdb) p commit
$5 = (git_commit_list_node *) 0x555555827308
(gdb) p *commit
$6 = {oid = {id = "B&\352\071\273\306Ν@\367:\005|O\201H\230\276ʲ"},
time = 0, generation = 0, seen = 0, uninteresting = 0,
  topo_delay = 0, parsed = 0, added = 0, flags = 0, in_degree = 0,
out_degree = 0, parents = 0x0}

the "uninteresting" flag appears to be used to hide commits, as if
they are not in the list.
- there are functions that get called to order the walk by commit date

Iterating over every reference and matching each one against a
wildcard glob takes some time here.

I'm imagining making this faster by enumerating just the references,
remotes, branches or such, rather than globbing. Ideally remotes,
since these map better to individual projects and codebases. But since
so many tags contain orphaned code, it could make sense to enumerate
all references instead; not sure. If the remote enumeration code could
be combined in without too much delay, then avoiding repeated remotes
could be included.

- when the first item is iterated from the revwalk, it performs
initial parsing. there is a commit graph file it uses.
- if a commit is uninteresting, it appears to mark all its parents as
uninteresting
- each commit is tracked as already seen or not. new commits are added
to the commit list.

this initialisation is in prepare_walk in libgit2/src/revwalk.c:617 or so

- when the revwalk is enumerated, commits are popped off the list and
returned. each popped commit has its parents added back to the list,
to enumerate them.
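
a rough sketch of that loop, again as a self-contained toy in C and not libgit2's implementation. the toy_* names are hypothetical, the node only loosely mirrors the fields gdb printed above, and the list is a small sorted array (newest first) where libgit2 has its own list and queue types. hiding is done eagerly over all ancestors here, which is an assumption made to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PARENTS 2
#define MAX_QUEUE 32

/* Hypothetical commit node with the flags discussed in the notes. */
struct toy_commit {
    int id;
    long time;
    unsigned seen : 1;
    unsigned uninteresting : 1;
    struct toy_commit *parents[MAX_PARENTS];
    size_t nparents;
};

/* Hypothetical commit list, kept sorted newest-first by commit time. */
struct toy_queue {
    struct toy_commit *items[MAX_QUEUE];
    size_t len;
};

/* Add a commit unless it was already seen, keeping time order. */
static void toy_enqueue(struct toy_queue *q, struct toy_commit *c)
{
    if (c->seen)
        return;
    c->seen = 1;
    assert(q->len < MAX_QUEUE);
    size_t i = q->len++;
    while (i > 0 && q->items[i - 1]->time < c->time) {
        q->items[i] = q->items[i - 1];
        i--;
    }
    q->items[i] = c;
}

/* Mark a commit and, recursively, all of its parents as uninteresting. */
static void toy_hide(struct toy_commit *c)
{
    if (c->uninteresting)
        return;
    c->uninteresting = 1;
    for (size_t i = 0; i < c->nparents; i++)
        toy_hide(c->parents[i]);
}

/* Pop commits until an interesting one turns up; each popped commit
 * has its parents added back to the list. Returns NULL when done. */
static struct toy_commit *toy_next(struct toy_queue *q)
{
    while (q->len > 0) {
        struct toy_commit *c = q->items[0];
        for (size_t i = 1; i < q->len; i++)
            q->items[i - 1] = q->items[i];
        q->len--;
        for (size_t i = 0; i < c->nparents; i++)
            toy_enqueue(q, c->parents[i]);
        if (!c->uninteresting)
            return c;
    }
    return NULL;
}
```

to use it, call toy_hide on anything to exclude, toy_enqueue the starting commits, then call toy_next until it returns NULL. the seen flag is what keeps a shared ancestor of two branches from being returned twice.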

that seems to be the guts of it. it's what you'd expect, but looking
at the internals makes it much easier to plan how to use them
differently. for example, i did not know there were functions to add
references or commit oids directly, rather than globs.


More information about the cypherpunks mailing list