15 Apr
2025
15 Apr
'25
10:31 a.m.
> so omigod i'm part of a partial hospitalization program now. i have to
> go every day. [it's so hard! today will be my second day!! i met a
> bunch of people. i had a rough weekend and totally ignored it and they
> called me right up and had me go! but if i miss 4 days i am out. so i
> gotta keep goin--
i'm trying this morning to implement an update() function for that
dict class that can generalize the __setitem__ function to multiple
items.
i was going to add the same functionality to the other classes and use
it in the dict class, but i'm having a _lot_ of body and mind
takeover, kinda roughly, associated with it, so i shrunk the problem
to just the dict class.
right now my code looks like this:
def update(self, keyhashitemsdict = {}, **keyhashitemskws):
updates = []#{}
spread = 0
hashshift = self._hashshift
hashbytes = self._hashbytes
for keyhashitems in [keyhashitemsdict, keyhashitemskws]:
for keyhash, item in keyhashitems:
assert item != self._sentinel
idx = int.from_bytes(keyhash[:self._hashbytes], 'big')
>> self._hashshift
place = self.array[idx]
if place != self._sentinel:
collision = self._key(place)
if collision != keyhash:
assert idx ==
int.from_bytes(collision[:self._hashbytes], 'big') >> self._hashshift
while int.from_bytes(keyhash[:hashbytes],
'big') >> hashshift != int.from_bytes(collision[:hashbytes], 'big') >>
hashshift:
spread += 1
hashbits = self._hashbits + spread
expansion = 1 << spread
hashbytes = (hashbits+7) >> 3
hashshift = (hashbytes << 3) - hashbits
#if spread == 0:
# updates.append([idx, item])
updates.append([idx, keyhash, item])
updates.sort(reverse=True)
if spread == 0:
allocsz = self._rep._allocsize
itemsz = self._itemsize
update_chunks = [[updates.pop()]]
while len(updates):
update = updates.pop()
if (update[0] + 1 - update_chunks[-1][-1][0]) * itemsz
>= allocsize:
update_chunks.append([update])
else:
update_chunks[-1].append(update)
for update_chunk in update_chunks:
if len(update_chunk) == 1:
idx, keyhash, item = update_chunk[0]
self.array[idx] = item
else:
min_idx, min_keyhash, min_item = update_chunk[0]
max_idx, max_keyhash, max_item = update_chunk[-1]
content = [min_item] +
self.array[min_idx+1:max_idx] + [max_item]
for idx, keyhash, item in update_chunk[1:-1]:
content[idx-min_idx] = item
self.array[min_idx:max_idx+1] = content
update_chunk[:] = []
else:
# big-endian expand, write entire array larger
def content_generator():
# need updates by idx
next_idx, next_keyhash, next_item = updates.pop() if
len(updates) else [float('inf'),None,None]
for superidx, item in enumerate(tqdm.tqdm(self.array,
desc='growing sentinel hashtable', leave=False)):
update_chunk = []
while next_idx == superidx:
keyhash = self._key(item)
wholeidx = int.from_bytes(keyhash[:hashbytes], 'big')
assert superidx == wholeidx >> (hashbytes * 8
- self._hashbits)
subidx = (wholeidx >> hashshift) & expansionmask
assert superidx * expansion + subidx ==
wholeidx >> hashshift
update_chunk.append([next_keyhash, next_item])
next_idx, next_keyhash, next_item =
updates.pop() if len(updates) else [float('inf'),None,None]
if item == self._sentinel:
# fill the section only with update information
else:
# insert the old item in any update information
i had to open it in a gui editor for it to copy right.
i'm working on the second else branch for spread > 0, implementing the
wasteful bigendian high-bits expansion that preserves item order and
keeps the underlying data clearer for a third party to reverse and
independently implement interfacing for. 'slave boss' keeps taking me
over when i try to engage the branch, which is normal for me and i
imagine many others. i've made progress!
i'm using the slower high-bits expansion because it is less
complicating to keep the implementation the same during these issues.
one thing at a time gives me more spaces to sneak coding in through my
experiences.
i'm presently confused around the code duplication of calculting the
subidx and superidx in potentially three places, the old __setitem__
function and the different potential approaches to inserting the old
item among new items here. it doesn't immediately look easy to
generalize to a generally useful concise function (because it engages
both class-local and scope-local data that hasn't been bundled
together yet, and doesn't have use outside this function and its old
implementation).
a good next step might be to either quickly generalize that (even if
the generalization is just a messy artefact of implementation) so as
to make the work simpler and clearer and use less working memory
opening more mental approaches, or to rote copy the code into 3
places. because i tend to insert errors, it's usually better to use
the former approach, but there's a lot of associated values here