put it together: offsets = list of all linebreaks then we form list of all byte ranges from the linebreaks then sort list of byte ranges. now make list of output byte ranges. this can be done by starting at the start and summing the lengths. then cut the lists so that there is no overlap at the edges. this means looking in one of the inout/output lists, for the offsets of the other while walking through it. that can be done with a binary search, which is the bisect module in python. the byte ranges turn into sequences. then we can swap the sequences one at a time to reorder the data in place. it makes sense to only swap the input list, which then represents how the data currently is, and slowly turns into a misordered output list. —> a problem remains: how to look up offsets to swap out written areas, when the regions are fragmented,p. hence, the working list is not ordered by lines, but rather is an ordered list of regions by offset, and another list gives the regions for each line. i think you do it with an output list and a working list of ordered regions.