On Mon, Jul 29, 2019 at 02:56:34PM +0200, Ævar Arnfjörð Bjarmason wrote:

> The thread I started at
> https://public-inbox.org/git/87bmhiykvw.fsf@xxxxxxxxxxxxxxxxxxx/ should
> also be of interest. I.e. we could have some knobs to create more
> "stable" packs. I know rsync does some in-file hashing, but I don't
> know if/how that works if you have 1 file split into N where some
> chunks in the N are in the one file.
>
> But it's possible to imagine a repacking algorithm that would keep
> producing entirely new packs but arrange for it to be ordered/delta'd
> in such a way that it optimizes for page-by-page similarity to an
> older pack to some degree.

I actually think that's the part that rsync does well. We don't keep
page-by-page similarity, but rsync (and other tools like borg) are
really good at finding the moved chunks. The problem is just that it
doesn't know to compare chunks between two files with unrelated names.

-Peff
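To make the "finding moved chunks" point concrete, here is a minimal sketch of the kind of matching rsync's delta-transfer algorithm does: index fixed-size blocks of the old data by a cheap rolling (Adler-32-style) weak checksum, slide a one-byte-at-a-time window over the new data, and confirm weak-checksum hits with a strong hash. This is an illustrative toy (tiny block size, simplified checksums, not rsync's actual code), and `find_moved_chunks` is a hypothetical helper name, not an rsync API:

```python
# Toy rsync-style chunk matching: find blocks of `old` that reappear,
# possibly at different offsets, in `new`. Not rsync's real code.
import hashlib

BLOCK = 8       # rsync uses ~700-byte blocks by default; tiny for demo
MOD = 1 << 16   # checksum components are kept mod 2^16

def weak(data):
    # Adler-32-style weak checksum of one window: (a, b) pair
    a = sum(data) % MOD
    b = sum((len(data) - i) * c for i, c in enumerate(data)) % MOD
    return a, b

def roll(a, b, out, inp, n):
    # Slide an n-byte window one byte: drop byte `out`, append byte `inp`.
    # This is what makes the per-offset scan cheap (O(1) per byte).
    a = (a - out + inp) % MOD
    b = (b - n * out + a) % MOD
    return a, b

def find_moved_chunks(old, new):
    # Index every non-overlapping BLOCK-sized block of `old` by weak
    # checksum, keeping a strong hash to rule out weak collisions.
    sigs = {}
    for off in range(0, len(old) - BLOCK + 1, BLOCK):
        blk = old[off:off + BLOCK]
        sigs.setdefault(weak(blk), []).append(
            (off, hashlib.md5(blk).digest()))

    matches = []  # (offset_in_old, offset_in_new) pairs
    if len(new) < BLOCK:
        return matches
    a, b = weak(new[:BLOCK])
    i = 0
    while i + BLOCK <= len(new):
        blk = new[i:i + BLOCK]
        for old_off, strong in sigs.get((a, b), ()):
            if hashlib.md5(blk).digest() == strong:
                matches.append((old_off, i))
                break
        if i + BLOCK < len(new):
            a, b = roll(a, b, new[i], new[i + BLOCK], BLOCK)
        i += 1
    return matches

old = b"abcdefghijklmnop"      # two 8-byte blocks
new = b"XXXXijklmnopabcdefgh"  # both blocks present, moved and shifted
print(find_moved_chunks(old, new))  # -> [(8, 4), (0, 12)]
```

Both old blocks are recovered even though they swapped order and shifted by a non-block-aligned amount, which is exactly the property that page-by-page similarity lacks. The limitation from the mail still applies: this only helps if the tool knows which two streams to compare in the first place.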