[ot][spam][crazy] in place sort of binary strings ie lines in a file
make simple enough, then do. i have a strong inhibition in the middle of it, and it’s not complicated, so it’s interesting, since inhibitions are so frustrating and it seems far enough from possible reasons to be theoretically separable
ok um offsets = list of all linebreaks then we form list of all byte ranges from the linebreaks
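a minimal sketch of that first step, assuming the data is already in memory as bytes and lines end in b"\n". the name line_ranges is made up for illustration. each range is (start, end) with the newline included, so the ranges tile the whole buffer.

```python
def line_ranges(data: bytes) -> list[tuple[int, int]]:
    """Return one (start, end) byte range per line, newline included."""
    # offsets = positions of all linebreaks (byte value 0x0A)
    offsets = [i for i, b in enumerate(data) if b == 0x0A]
    ranges = []
    start = 0
    for off in offsets:
        ranges.append((start, off + 1))  # end is exclusive, just past the newline
        start = off + 1
    if start < len(data):
        # trailing bytes with no final newline still form a range
        ranges.append((start, len(data)))
    return ranges
```

a last line with no newline gets its own range here, but sorting it anywhere other than last would glue two lines together, so it is probably simplest to assume the file ends with a newline.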
then sort list of byte ranges. now we have a list of starting byte ranges, and a list of the order they go in. now make list of output byte ranges. then cut the lists so that there is no overlap at the edges. the byte ranges turn into sequences. then we can swap the sequences one at a time to reorder the data in place. simple approach. problem remains: cutting them needs to facilitate swapping. need to be clear on what a useful length is. form map of input to output.
now make list of output byte ranges. this can be done by starting at the start and summing the lengths.
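sketch of the sort plus the summing, assuming the input ranges are (start, end) tuples like above. output_ranges is a hypothetical name. it returns the input ranges in sorted order alongside where each one should end up, which is the input-to-output map.

```python
from itertools import accumulate


def output_ranges(data: bytes, ranges: list[tuple[int, int]]):
    """Return (order, out): input ranges in sorted order, and their destinations."""
    # sort the ranges by the bytes they cover
    order = sorted(ranges, key=lambda r: data[r[0]:r[1]])
    # start at the start and sum the lengths to get the output boundaries
    ends = list(accumulate(end - start for start, end in order))
    starts = [0] + ends[:-1]
    return order, list(zip(starts, ends))
```

so for b"ccc\na\nbb\n" the line at (4, 6) goes to (0, 2), the one at (6, 9) goes to (2, 5), and the one at (0, 4) goes to (5, 9). note the sort key copies every line, so this step is not in place by itself.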
then cut the lists so that there is no overlap at the edges. the byte ranges turn into sequences.
ok um we need the data ….. um
then we can swap the sequences one at a time to reorder the data in place. here also we’ll need to be sure the ranges update when the swap happens
so basically when a swap happens, one range can overlap the other end of the swap. so we can simplify by swapping only part. maybe it makes sense to cut when the swap happens. but maybe for clarity we can cut first. ok ummm so this means looking in one of the input/output lists, for the offsets of the other. that can be done with a binary search, which is the bisect module in python. this was probably where the inhibition was stemming from. now it’s migrated to implementation.
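one possible reading of the cutting step, as a sketch: given the sorted boundary offsets of one list, split a range from the other list at every boundary that falls strictly inside it. cut_at is a made-up helper; bisect_left and bisect_right are the real stdlib calls.

```python
from bisect import bisect_left, bisect_right


def cut_at(boundaries: list[int], start: int, end: int) -> list[tuple[int, int]]:
    """Split [start, end) at every boundary strictly inside it.

    boundaries must be sorted ascending.
    """
    lo = bisect_right(boundaries, start)  # first boundary > start
    hi = bisect_left(boundaries, end)     # first boundary >= end
    points = [start, *boundaries[lo:hi], end]
    return list(zip(points, points[1:]))
```

with input boundaries [0, 4, 6, 9], the output range (2, 5) is cut into (2, 4) and (4, 5), and (0, 2) stays whole. this only covers one direction of the cut; whether cutting both ways ever settles without fragmenting a lot is one of the unchecked spots.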
so different regions may go to different places, so in python if the data is held as a list of lists, the entries in the lists can be swapped to propagate the references. the input and output data can hold separate items, or identical items. it seems simplest if both lists are swapped, for four total data changes, but i suppose it makes sense to only swap the input list, which then represents how the data currently is, and slowly turns into the output list …. i think? kind of. i think a misordered output list. a couple spots unchecked. seems reasonable to try to implement. happy made progress thinking on. not sure what to do next.
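the single swap could look something like this, assuming the data sits in a bytearray and the two chunks are the same length and do not overlap (which is what the cutting is for). swap_chunks is a hypothetical name. the bookkeeping of which region holds what is left out here.

```python
def swap_chunks(buf: bytearray, a: int, b: int, n: int) -> None:
    """Exchange buf[a:a+n] and buf[b:b+n] in place."""
    # overlapping chunks would clobber each other, so refuse them
    assert a + n <= b or b + n <= a, "chunks must not overlap"
    # the right-hand side makes two temporary n-byte copies, then
    # same-length slice assignment writes them back without resizing
    buf[a:a + n], buf[b:b + n] = buf[b:b + n], buf[a:a + n]
```

e.g. on bytearray(b"ccc\na\nbb\n"), swap_chunks(buf, 0, 4, 2) puts "a\n" at offset 0 and leaves the first two bytes of the old first line at offset 4, which is where the fragmenting starts.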
put it together: offsets = list of all linebreaks then we form list of all byte ranges from the linebreaks then sort list of byte ranges. now make list of output byte ranges. this can be done by starting at the start and summing the lengths. then cut the lists so that there is no overlap at the edges. this means looking in one of the input/output lists, for the offsets of the other while walking through it. that can be done with a binary search, which is the bisect module in python. the byte ranges turn into sequences. then we can swap the sequences one at a time to reorder the data in place. it makes sense to only swap the input list, which then represents how the data currently is, and slowly turns into a misordered output list. -> a problem remains: how to look up offsets to swap out written areas, when the regions are fragmented. hence, the working list is not ordered by lines, but rather is an ordered list of regions by offset, and another list gives the regions for each line. i think you do it with an output list and a working list of ordered regions.
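a rough end-to-end sketch of one way this could go, not necessarily the way. assumptions: data is a bytearray in memory, lines end in b"\n", and instead of an ordered working list plus bisect it uses two dicts (by_start and by_key, both made-up names), one keyed by current offset and one keyed by (line, offset within line). it cuts at swap time rather than up front.

```python
from functools import cmp_to_key


def sort_lines_in_place(buf: bytearray) -> None:
    """Reorder the lines of buf into sorted order by swapping equal-length chunks."""
    starts = [0] + [i + 1 for i, b in enumerate(buf) if b == 0x0A]
    if starts[-1] != len(buf):
        starts.append(len(buf))  # last line has no newline
    ranges = list(zip(starts, starts[1:]))

    def cmp(i, j):
        # compare two lines at a time so only two lines are copied at once
        a = buf[ranges[i][0]:ranges[i][1]]
        b = buf[ranges[j][0]:ranges[j][1]]
        return (a > b) - (a < b)

    order = sorted(range(len(ranges)), key=cmp_to_key(cmp))

    # working state: which fragment of which line currently sits where
    by_start = {s: (e, i, 0) for i, (s, e) in enumerate(ranges)}  # start -> (end, line, k)
    by_key = {(i, 0): s for i, (s, e) in enumerate(ranges)}       # (line, k) -> start

    pos = 0  # everything before pos is already final
    for t in order:
        tlen = ranges[t][1] - ranges[t][0]
        k = 0  # next byte of line t still to be placed
        while k < tlen:
            q = by_key.pop((t, k))        # where the wanted fragment is now
            qe, _, _ = by_start.pop(q)
            if q == pos:
                n = qe - q                # already in place, nothing to move
            else:
                pe, pl, pk = by_start.pop(pos)  # fragment currently at pos
                del by_key[(pl, pk)]
                n = min(qe - q, pe - pos)       # cut to the shorter fragment
                buf[pos:pos + n], buf[q:q + n] = buf[q:q + n], buf[pos:pos + n]
                # the displaced bytes now live where the wanted ones were
                by_start[q] = (q + n, pl, pk)
                by_key[(pl, pk)] = q
                if pos + n < pe:          # uncut remainder of the displaced fragment
                    by_start[pos + n] = (pe, pl, pk + n)
                    by_key[(pl, pk + n)] = pos + n
                if q + n < qe:            # uncut remainder of the wanted fragment
                    by_start[q + n] = (qe, t, k + n)
                    by_key[(t, k + n)] = q + n
            pos += n
            k += n
```

usage would be something like buf = bytearray(b"ccc\na\nbb\n"); sort_lines_in_place(buf). each step finalizes at least one byte so it terminates, but fragments are never merged back together, so the bookkeeping can grow well past the number of lines.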
task: swap out region start at offset 0. say we have all the lists. using all of them is similar to considering which one to use.
participants (1)
-
Undescribed Horrific Abuse, One Victim & Survivor of Many