Alexander Gavrilov <angavrilov@xxxxxxxxx> writes:

> This pair of patches aims at increasing performance of copy detection
> in blame by avoiding unnecessary comparisons. Note that since I'm new
> to this code, I might have misunderstood something.
>
> There are two cases that I aim to fix:
>
> 1) Copy detection is done by comparing all outstanding chunks of the
> target file to all blobs in the parent. After that, chunks with
> suitable matches are split, and the comparison is repeated until there
> are no new matches. The trouble is that chunks that didn't match the
> first time, and weren't split, are compared against the same set of
> blobs again and again. I add a flag to track that.
>
> On my test case it decreased blame -C -C time from over 10min to
> ~6min; 4min with -C80.
>
> 2) Chunks are split only if the match scores above a certain
> threshold. I understand that a split of an entry cannot score more
> than the entry itself. Thus, it is pointless to even try doing costly
> comparisons for small entries.
>
> (Time goes down to 4min; 2min with -C80.)

The ideas behind both patches sound very sane. Will take a deeper look
later.

Thanks.
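To make the first idea concrete, here is a minimal sketch of the
scan-once pattern, assuming a simplified chunk list and a stubbed-out
comparison; the struct, field and function names below are invented for
the example and are not the ones used in builtin-blame.c:

#include <stdio.h>
#include <stdlib.h>

/* Illustrative stand-in for a blame entry. */
struct chunk {
	struct chunk *next;
	int lno, num_lines;
	unsigned scanned : 1;	/* already compared against every parent blob */
};

/* Stub for the expensive comparison: pretend a chunk "matches" a blob
 * in the parent only if it is longer than 8 lines. */
static int finds_copy_in_parent(const struct chunk *c)
{
	return c->num_lines > 8;
}

/* Split a matching chunk in half; both halves are new, unscanned work. */
static void split_chunk(struct chunk *c)
{
	struct chunk *tail = malloc(sizeof(*tail));
	tail->lno = c->lno + c->num_lines / 2;
	tail->num_lines = c->num_lines - c->num_lines / 2;
	tail->scanned = 0;
	tail->next = c->next;
	c->num_lines /= 2;
	c->next = tail;
}

int main(void)
{
	struct chunk c2 = { NULL, 40, 24, 0 };
	struct chunk c1 = { &c2, 0, 40, 0 };
	int comparisons = 0, splits;

	do {
		splits = 0;
		for (struct chunk *c = &c1; c; c = c->next) {
			if (c->scanned)
				continue;	/* nothing new to learn here */
			comparisons++;
			if (finds_copy_in_parent(c)) {
				split_chunk(c);
				splits++;
				c = c->next;	/* don't rescan the new tail this pass */
			} else {
				c->scanned = 1;
			}
		}
	} while (splits);	/* repeat until a pass produces no new splits */

	printf("%d comparisons\n", comparisons);
	return 0;
}

Entries that never find a match are flagged after their first pass and
skipped on every later pass; the only entries that are compared more
than once are the halves of a split, which are genuinely new work.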
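The second idea would drop into the same loop as an early-out before
the comparison. A hedged sketch, assuming (as the mail argues) that no
split of an entry can outscore the entry itself and that an entry's own
size bounds its best possible score; copy_score_threshold is an assumed
name standing in for whatever the -C threshold is called in the actual
code:

/* Cheap upper bound on the score any split of 'c' could earn.  Here
 * it is simply the chunk's size in lines; the real scoring in blame
 * is more involved, but the argument is the same: a piece cannot
 * outscore the whole. */
static unsigned score_ceiling(const struct chunk *c)
{
	return (unsigned)c->num_lines;
}

/* Inside the scan loop of the previous sketch, right after the
 * 'scanned' check: */
	if (score_ceiling(c) < copy_score_threshold)
		continue;	/* can never reach the threshold; skip the costly diff */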