
karl3@writeme.com wrote:
to clarify i have been working on https://raw.githubusercontent.com/karl3wm/httptransformer/refs/heads/main/ne... which is part of the git repository https://github.com/karl3wm/httptransformer , the functionality of which is presently centered around
I vectorized this check with `(next_hole < tails[idx:])` to perform it for every requested read in a sorted batch. But I don't think this logic is likely to be correct: if the reads are non-overlapping, then the start of the next hole will only overlap one of them, and the others will instead be performing an ordering check rather than a check for being uncached.

i guess `next_hole < tails[idx:]` finds all the regions that end after the first upcoming hole, ie the first one containing a hole if any do, and all following ones :D so maybe uhhh ... `(next_hole < tails[idx:]) & (next_data > offsets_lengths[idx:,0])` where next_data > next_hole ...? :)

i wonder if it matters whether one uses aligned or unaligned tails and offsets here o_0 i guess it might only make a difference if there's a consistency problem where the data on disk becomes unaligned ... could be wrong
i think this might find the next region of uncached reads, not sure: `(next_hole < tails[idx:]) & (next_data > offsets_lengths[idx:,0])` then the next region of cached reads might be `tails[idx:] < next_hole ...
! :) :s
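
A minimal sketch of the two-sided check, to make the difference from the one-sided check concrete. It assumes `offsets_lengths` is an (N, 2) integer array of `[offset, length]` rows sorted by offset, that reads are half-open ranges `[offset, tail)`, and that the hole spans `[next_hole, next_data)`. The function name `reads_touching_hole` and the sample numbers are made up for illustration and are not taken from the repository.

```python
import numpy as np

def reads_touching_hole(offsets_lengths, idx, next_hole, next_data):
    """Boolean mask over reads idx.. that overlap the hole [next_hole, next_data).

    Assumption: column 0 is the read offset, column 1 is the read length.
    """
    offsets = offsets_lengths[idx:, 0]
    tails = offsets + offsets_lengths[idx:, 1]
    # Standard interval overlap: the read ends after the hole starts
    # AND the read starts before the hole ends.
    # Note `&` rather than `and`: `and` raises on multi-element arrays.
    return (next_hole < tails) & (next_data > offsets)

# Four non-overlapping reads: [0,10), [10,20), [30,40), [50,60)
offsets_lengths = np.array([[0, 10], [10, 10], [30, 10], [50, 10]])
tails = offsets_lengths[:, 0] + offsets_lengths[:, 1]

# One-sided check with a hole at [15, 35): every read ending after 15 matches,
# including the last read, which lies entirely past the hole.
one_sided = 15 < tails

# Two-sided check: only the reads that actually overlap the hole match.
two_sided = reads_touching_hole(offsets_lengths, 0, 15, 35)
```

With these sample values the one-sided mask also flags the final read, while the two-sided mask flags only the two reads that intersect the hole. Whether aligned or unaligned tails and offsets should be fed in is left open here, as in the text above.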
a secondary logic concern would be correctly outputting the reads that cannot be cached due to disk space exhaustion. these need to be output from fetches rather than the mmap structure (which would give 0s if read for uncached data). so, when adding OP_OUTPUT i'd want to:
- output all cached reads from mmap data
- only output uncached reads from mmap data if they are going to be written to the cache

OP_PLACE says to write data to the cache, OP_OUTPUT says to output the data. i didn't really polish the ops; basically i'd have it output from the cache if the output operation doesn't have a fetch operation (or optionally/alternatively does have a place operation), and output straight from the fetch instead if the operation has a fetch operation (or optionally/additionally does not have a place operation). this could be made clearer but is just some bit constants
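
One way the output rule could look as bit constants, as a rough sketch only. `OP_PLACE` and `OP_OUTPUT` are named above; `OP_FETCH`, the numeric values, and the helper `output_source` are assumptions for illustration. This picks the variant where a fetched read is output straight from the fetch only when it is not also being placed in the cache.

```python
# Hypothetical bit values; the real constants may differ.
OP_FETCH = 1   # assumed name: data must be fetched (it is not cached)
OP_PLACE = 2   # write fetched data into the cache (mmap)
OP_OUTPUT = 4  # hand the data to the caller

def output_source(op):
    """Return where an op's output should come from: 'mmap', 'fetch', or None."""
    if not op & OP_OUTPUT:
        return None  # nothing is output for this op
    if (op & OP_FETCH) and not (op & OP_PLACE):
        # Uncached and not going to be cached (e.g. disk space exhausted):
        # reading the mmap would give 0s, so use the fetched bytes directly.
        return "fetch"
    # Either already cached, or fetched and placed. The placed case assumes
    # the write to the cache completes before the mmap is read.
    return "mmap"

cached_read = output_source(OP_OUTPUT)
placed_read = output_source(OP_FETCH | OP_PLACE | OP_OUTPUT)
uncacheable_read = output_source(OP_FETCH | OP_OUTPUT)
```

Here `cached_read` and `placed_read` both resolve to the mmap, and `uncacheable_read` resolves to the fetch. The other variant mentioned above (output from the fetch whenever a fetch exists) would just drop the `OP_PLACE` test.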