Shawn Pearce <spearce@xxxxxxxxxxx> writes:

> Git does source code well. I don't know enough to judge if DNA/RNA
> sequence storage is similar enough to source code to benefit from
> things like `git log -p` showing deltas over time, or if some other
> algorithm would be more effective.
>
>> From my understanding the largest problem revolves around git's delta
>> discovery method, holding 2 files in memory at once - is there a
>> reason this could not be adapted to page/chunk the data in a sliding
>> window fashion ?
>
> During delta discovery Git holds like 11 files in memory at once....

Even though the original question mentioned "delta discovery", I think
what was being asked is not about "delta" in the Git sense (which your
answer is about) but is "can we diff two long sequences of text (that
happen to consist of only a 4-letter alphabet, but that is an
irrelevant detail) without holding both in-core in their entirety?",
which is a more relevant question/desire from the application point of
view.

"Is there a reason this could not be adapted?"  No, there is no
particular reason why this "could not".  I think that the only reason
we only do in-core diff is that "adapting to page/chunk" hasn't been
high on anybody's list of itches to scratch.
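To illustrate what "page/chunk" could mean here, below is a minimal
sketch in Python.  It is not how Git's diff works and it is not a real
diff: it only compares two streams position by position, so a single
insertion or deletion shifts everything after it and shows up as one
long differing range.  The function name `differing_ranges` and the
`window` parameter are made up for this illustration.  The point is
only that at most one window from each input is held in memory at a
time.

```python
import io


def differing_ranges(a, b, window=64 * 1024):
    """Yield (start, end) byte offsets at which streams a and b differ.

    Hypothetical helper for illustration only.  It assumes both streams
    are binary and that read(window) returns a full window except at
    end of file (true for io.BytesIO and ordinary buffered files).
    """
    offset = 0        # absolute offset of the current window
    run_start = None  # start of the differing run currently open, if any
    while True:
        chunk_a = a.read(window)
        chunk_b = b.read(window)
        if not chunk_a and not chunk_b:
            break
        n = max(len(chunk_a), len(chunk_b))
        for i in range(n):
            # A position past the end of the shorter stream counts as different.
            same = (i < len(chunk_a) and i < len(chunk_b)
                    and chunk_a[i] == chunk_b[i])
            if not same and run_start is None:
                run_start = offset + i
            elif same and run_start is not None:
                yield (run_start, offset + i)
                run_start = None
        offset += n
    if run_start is not None:
        # The last differing run extends to the end of the longer stream.
        yield (run_start, offset)


old = io.BytesIO(b"ACGTACGTACGT")
new = io.BytesIO(b"ACGAACGTACTTAA")
print(list(differing_ranges(old, new, window=4)))
```

A differing run that crosses a window boundary is carried over in
`run_start`, so the result does not depend on the window size.  A real
paged diff would also have to resynchronize after insertions and
deletions, which is the hard part and is not attempted here.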