
skunks get personal armor with government money [joan of arc asked for accurate spirit information on AI influence/lawyer
---
i'm still working on rep/dict.py !
for a little smidge i didn't have wifi so i made a very simple file-based backend and just now got it working
here's my current crash
  File "/home/karl3/projects/rep/rep/dict.py", line 392, in <module>
    doc.update([[val,val]])
  File "/home/karl3/projects/rep/rep/dict.py", line 347, in update
    super().update(keyhashitems())
  File "/home/karl3/projects/rep/rep/dict.py", line 225, in update
    self.array[:] = IterableWithLength(content_generator(), capacity)
    ~~~~~~~~~~^^^
  File "/home/karl3/projects/rep/rep/array.py", line 56, in __setitem__
    self.doc[start * sz : stop * sz] = data
    ~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/karl3/projects/rep/rep/rep.py", line 167, in __setitem__
    piece = prefix + data[:] + suffix[:suffixoff]
                     ~~~~^^^
  File "/home/karl3/projects/rep/rep/rep.py", line 220, in __getitem__
    buf += next(self.iteration)
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/karl3/projects/rep/rep/dict.py", line 218, in content_generator
    assert superidx * expansion + subidx == int.from_bytes(dbg_keyhash[:hashbytes], 'big') >> hashshift
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError
[stares blankly at crash] i wonder what all this funny text means!
so what's going on here. it's trying to validate this chunk of updates. the code is intended to check things are correct. a chunk of updates is a sequence of keyvals here, which have a keyhash that has a superidx, subidx, and newidx. the code is trying to verify that the newidx is correct by calculating it two ways. 0835
(Pdb) p [superidx, expansion, subidx], [hashbytes, dbg_keyhash[:hashbytes], hashshift, int.from_bytes(dbg_keyhash[:hashbytes], 'big') >> hashshift]
([0, 8, 1], [1, b'\x0c', 5, 0])
the fact that the regenerated newidx is 0 shows that the superidx of 0 is correct. 0837
(Pdb) p newidx2subidx(int.from_bytes(dbg_keyhash[:hashbytes], 'big') >> hashshift)
0
the subidx is 0 too. so that shows that this value was likely placed in the wrong index of update_chunk for its content, and i can move an assertion to the previous block ...
so the superidx is 0 that means it's engaging the first item of the original array. the expansion is 8, the subidx is 1, the hashshift is 5, and the expected final index is 0.
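a small sketch of the two calculations the failing assertion compares, plugging in only the values pdb printed above. the names idx_from_position and idx_from_hash are made up for this illustration and are not in dict.py, and only the first byte of dbg_keyhash is known from the output.

```python
# values copied from the pdb output above
superidx, expansion, subidx = 0, 8, 1
hashbytes, hashshift = 1, 5
dbg_keyhash = b'\x0c'  # only the first hashbytes byte(s) were printed

# way 1: the slot implied by where the item sits in update_chunk
idx_from_position = superidx * expansion + subidx

# way 2: the slot implied by the item's key hash
idx_from_hash = int.from_bytes(dbg_keyhash[:hashbytes], 'big') >> hashshift

print(idx_from_position, idx_from_hash)  # 1 0
```

0x0c is 12, and 12 >> 5 is 0, so the hash points at slot 0 while the position is slot 1, which is the mismatch the assertion reports.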
it's another problem with my bitwise arithmetic
(Pdb) list 202
197         assert superidx == newidx >> (hashbits - self._hashbits)
198         #subidx = (newidx >> hashshift) & expansionmask
199         subidx = newidx & expansionmask
200         assert superidx * expansion + subidx == newidx# >> hashshift
201         return subidx
202     for superidx, item in enumerate(tqdm.tqdm(self.array if self._capacity else [self._sentinel], desc='growing sentinel hashtable', leave=False)):
203         update_chunk = [self._sentinel] * expansion
204         if item != self._sentinel:
205             keyhash = self._key(item)
206             newidx = int.from_bytes(keyhash[:hashbytes], 'big') >> hashshift
207             update_chunk[newidx2subidx(newidx)] = item
(Pdb) list 213
208         dbg_additions = []
209         while next_superidx == superidx:
210             item = next_item
211             newidx = next_newidx
212             dbg_additions.append([next_newidx, next_keyhash, next_item])
213             update_chunk[newidx2subidx(newidx)] = item
somewhere before line 213 here which is where this item is placed. if newidx2subidx() has output that wholly depends on its input, then newidx here must mismatch a shifted mask of item's key. 0839 ... :D
214             next_newidx, next_keyhash, next_item = updates.pop() if len(updates) else [1<<hashbits,None,None]
215             next_superidx = next_newidx >> (hashbits - self._hashbits)
216         for subidx, item in enumerate(update_chunk):
217             dbg_keyhash = self._key(item)
218  ->         assert superidx * expansion + subidx == int.from_bytes(dbg_keyhash[:hashbytes], 'big') >> hashshift
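a hedged sketch of the split that newidx2subidx() and the superidx assertions in the listing rely on. it assumes expansion is a power of two, that expansionmask is expansion - 1, and that hashbits - self._hashbits equals log2(expansion); none of that is confirmed by the listing. split_newidx is a hypothetical standalone helper, not a function in dict.py.

```python
def split_newidx(newidx: int, expansion: int) -> tuple[int, int]:
    """Split an index in the grown table into (superidx, subidx).

    Assumes expansion is a power of two: the low log2(expansion) bits pick
    the slot inside the chunk, the remaining high bits pick the old slot.
    """
    expansionbits = expansion.bit_length() - 1
    assert 1 << expansionbits == expansion, "expansion must be a power of two"
    expansionmask = expansion - 1
    superidx = newidx >> expansionbits
    subidx = newidx & expansionmask
    return superidx, subidx


expansion = 8
for newidx in range(4 * expansion):
    superidx, subidx = split_newidx(newidx, expansion)
    # recombining the two parts must give back the index we started with
    assert superidx * expansion + subidx == newidx

print(split_newidx(0, 8))   # (0, 0)
print(split_newidx(10, 8))  # (1, 2)
```

under these assumptions the round trip holds for every newidx, so a failure of the same identity at line 218 would point at what is stored in the slot rather than at the split itself.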
not sure if i posted that i separated wholeidx into byteidx and newidx because i was sometimes assuming it was downshifted and other times not. [i could have referenced the old code to be consistent, that might have been clearer, but it is a logical computer-checked system anyway]
(Pdb) p item
b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
(Pdb) p update_chunk
[b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', bytearray(b'\xd6\x00\x00\x00\x00\x00\x00\x00\xd8\x00\x00\x00\x00\x00\x00\x00'), b'\xd6\x00\x00\x00\x00\x00\x00\x00\xd8\x00\x00\x00\x00\x00\x00\x00', b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00']
i'm guessing all those 16-long sequences of zeros are bugs in the file backend i tried to add :D
oh no that's the new algorithm where it preallocates a length of sentinels. hmm so expansion is 8, there's identical data here in 0-based slot 2 and 3, that's strange; and one's a bytearray meaning they came from different places
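one way to spot that kind of duplicate mechanically, as a rough sketch. the chunk below is retyped from the pdb output, with the sentinel taken to be 16 zero bytes as printed; duplicate_slots is a hypothetical helper, not part of rep. bytes and bytearray compare equal in python when their contents match, so the type name is recorded separately to show where each copy might have come from.

```python
sentinel = bytes(16)  # 16 zero bytes, as printed by pdb
dup = b'\xd6\x00\x00\x00\x00\x00\x00\x00\xd8\x00\x00\x00\x00\x00\x00\x00'
update_chunk = [sentinel, sentinel, bytearray(dup), dup,
                sentinel, sentinel, sentinel, sentinel]


def duplicate_slots(chunk, sentinel):
    """Map each repeated non-sentinel value to its (slot, type name) pairs."""
    seen = {}
    for slot, item in enumerate(chunk):
        if item == sentinel:
            continue
        # bytes(item) normalizes bytearray so both copies share one dict key
        seen.setdefault(bytes(item), []).append((slot, type(item).__name__))
    return {value: slots for value, slots in seen.items() if len(slots) > 1}


dups = duplicate_slots(update_chunk, sentinel)
print(dups[dup])  # [(2, 'bytearray'), (3, 'bytes')]
```

a check like this could sit next to the existing dbg_ assertions while hunting for where the second copy gets written.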