[ot][spam][crazy] log 2: append-only random writes
i've got a lot of inhibition around simplifying and generalising some parts of the last stable one at https://lists.cpunks.org/pipermail/cypherpunks/2022-July/102291.html so maybe i'll try going again with more conceptualisation of a working algorithm:

- retain an index. when writing, merge the region written to into the index, with 0 depth
- have nodes track the depth of their deepest leaf and any other information needed about them to perform operations when making a new index or updating:
  - index regions that are at max_depth are copied in so as to retain max_depth
  - regions that are trimmed by overwrites should be checked to see if any subnodes are completely overwritten, and have their depth updated to a shallower value if they are
  - regions can be merged with their neighbors into a reference to the shallowest shared parent, including the last flush as the most shallow parent, if such a merge would not exceed the max depth
  - the merging of index regions is the same as the merging of writes

iterating leaves should no longer be needed for operation, and could be simplified into a recursive index iteration. the boilerplate may be mostly reusable.
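a minimal sketch of the depth rule above, to pin down what "0 depth" and max_depth mean here: a fresh write enters the index as a leaf of height 0, and two neighboring entries may only be collapsed into a reference to a shared parent if that parent's deepest leaf still fits. Entry, merge_would_fit and the max_depth value of 4 are hypothetical names and numbers for illustration, not part of the drafts below.

class Entry:
    def __init__(self, children = ()):
        self.children = list(children)
        # height is the depth of this entry's deepest leaf; a fresh
        # write has no children and so enters the index at height 0
        self.height = 1 + max((child.height for child in self.children), default = -1)

def merge_would_fit(shared_parent, max_depth = 4):
    # neighboring index entries may be replaced by a reference to their
    # shared parent only if that parent's subtree stays within max_depth
    return shared_parent.height <= max_depth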
class Flush:
    def __init__(self, parent = None):
        self.parent = parent
        if self.parent is not None:
            self.index = [self.parent]
        else:
            self.index = []

    def write(self, offset, data):
        # index updating can happen all inside write method
        pass
class Data:
    def __init__(self, offset, data, end = None):
        self.start = offset
        self.data = data
        self.end = end if end is not None else self.start + len(self.data)

class Flush:
    def __init__(self, parent = None):
        self.parent = parent
        if self.parent is not None:
            # this should expand self.parent if it's too deep, and
            # shallow it if there is only one child
            self.index = [self.parent]
        else:
            self.index = []

    def write(self, offset, data):
        # 1. trim index to left and right of data
        # 2. remove items in between and replace with data
        pass
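a possible fleshing-out of steps 1 and 2, written against the Data class drafted above; splice_write and the flat sorted-list index it works on are assumptions for illustration, not the actual implementation.

def splice_write(index, offset, data):
    # index: flat list of Data nodes, sorted by .start and non-overlapping
    new = Data(offset, data)
    kept = []
    for node in index:
        if node.end <= new.start or node.start >= new.end:
            kept.append(node)  # untouched neighbor
            continue
        if node.start < new.start:
            # 1. trim: keep the part of the node left of the write
            kept.append(Data(node.start, node.data[:new.start - node.start]))
        if node.end > new.end:
            # 1. trim: keep the part of the node right of the write
            kept.append(Data(new.end, node.data[new.end - node.start:]))
        # nodes fully covered by the write are dropped here
    kept.append(new)  # 2. replace the removed span with the new data
    kept.sort(key = lambda node: node.start)
    return kept

for example, splice_write([Data(0, b'aaaa')], 1, b'bb') leaves nodes at offsets 0, 1 and 3.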
member variables draft. i think this is all the information that was being replaced by leaf and index iteration.

- leaf count can be updated by summing children when graph is changed.
- height is simply maximum child height + 1, can also update when graph is changed
- parents is for adding dropped nodes to when branches are made more shallow. it can be prefix-compared against to find a shared ancestor when unifying neighboring entries. this shouldn't need an update for graph changes since the nontrimmed data is immutable.

class Data:
    def __init__(self, offset, data, end = None, height = 0, parents = None, leaves = 1):
        self.start = offset
        self.data = data
        self.end = end if end is not None else self.start + len(self.data)
        self.height = height
        self.parents = parents if parents is not None else []
        self.leaves = leaves
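a sketch of the prefix comparison mentioned in the parents note, assuming each node's parents list is ordered from the oldest ancestor down to the immediate parent; shared_ancestor is a hypothetical helper and that ordering is an assumption, not something fixed by the draft above.

def shared_ancestor(left, right):
    # walk the two ancestor lists in step; the last entry they agree on
    # is the most specific ancestor shared by both neighbors, one
    # candidate for unifying the two neighboring entries
    found = None
    for a, b in zip(left.parents, right.parents):
        if a is not b:
            break
        found = a
    return found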
last stable version was at https://lists.cpunks.org/pipermail/cypherpunks/2022-July/102291.html

# member variables draft. i think this is all the information that
# was being replaced by leaf and index iteration.
# - leaf count can be updated by summing children when graph is
#   changed.
# - height is simply maximum child height + 1, can also update
#   when graph is changed
# - parents is for adding dropped nodes to when branches are made
#   more shallow. it can be prefix-compared against to find a
#   shared ancestor when unifying neighboring entries. the only
#   other update I'm thinking of is when a new flush adds a shared
#   parent to every node.
# it may be simpler or clearer to retain parent references,
# and walk up them to the root to form parents lists when
# they are needed; unsure

class Data:
    def __init__(self, offset, data, end = None, height = 0, _parents = None, leaves = 1, parent = None):
        self.start = offset
        self.data = data
        self.end = end if end is not None else self.start + len(self.data)
        self.height = height
        self.parent = parent
        self._parents = _parents if _parents is not None else []
        self.leaves = leaves

class Flush:
    def __init__(self, parent = None):
        self.parent = parent
        if self.parent is not None:
            # this should expand self.parent if it's too deep, and
            # shallow it if there is only one child
            # we don't want to expand the index past the max depth
            self.index = [self.parent]
        else:
            self.index = []

    def write(self, offset, data):
        # 1. trim index to left and right of data
        #    when trimming, can make the trimmed nodes shallower if parts
        #    are removed. removed parents can be added to .parents
        #    if trimming makes a branch shallower at its leaves, it may
        #    become mergeable with its other neighbor
        # 2. remove items in between and replace with data
        # 3. merge data
        pass
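the alternative floated at the end of the comment block above (retaining only parent references and walking up to the root when a parents list is needed) might look roughly like this; parents_of is a hypothetical name.

def parents_of(node):
    # follow .parent links upward and return the chain ordered
    # root-first, immediate parent last
    chain = []
    while node.parent is not None:
        chain.append(node.parent)
        node = node.parent
    chain.reverse()
    return chain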
last stable version was at https://lists.cpunks.org/pipermail/cypherpunks/2022-July/102291.html
stable version 1 that made an index only when a flush was constructed is at https://lists.cpunks.org/pipermail/cypherpunks/2022-July/102291.html

the attached wip version is behaving consistently and retains a running index updated on every write, but makes less optimized trees because i have not implemented branch merging or making branches shallower when they get skinnier, and am using recursion and excessive heap construction to shorten the implementation for my confused mind.

I seem to have more inhibition when I try to make dependency components with suites of methods. very hard to reuse these. so I redid it a different way. Just implemented write merging.
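a rough standalone sketch of what the write merging just mentioned can look like: contiguous writes coalesced into one. merge_writes and the (start, bytes) pair representation are illustrative assumptions, not the attached code.

def merge_writes(writes):
    # writes: list of (start, bytes) pairs, sorted by start and non-overlapping
    merged = []
    for start, data in writes:
        if merged and merged[-1][0] + len(merged[-1][1]) == start:
            # this write begins exactly where the previous one ends,
            # so coalesce the two into a single longer write
            prev_start, prev_data = merged.pop()
            merged.append((prev_start, prev_data + data))
        else:
            merged.append((start, data))
    return merged

for example, merge_writes([(0, b'ab'), (2, b'cd'), (10, b'xy')]) gives [(0, b'abcd'), (10, b'xy')].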
i accidentally attached one of these to my other append-only random thread, and am continuing there