On Tue, 11 Dec 2007, Linus Torvalds wrote: > > Libxdiff already has a xdl_trim_ends() that strips all the common > > beginning and ending records, but at that point files are already loaded. > > That's not the problem. The problem with xdl_trim_ends() is that it > happens *after* you have done all the hashing, so as an optimization it's > fairly useless, because it still leaves the real cost (the per-line > hashing) on the table. Careful. The real cost of diffing, is not the O(1) pass of the prepare phase. It's the potentially O(N*M) worst case of the cross-record compare. So that optimization is far from useless. That optimization is indeed mainly targeted to avoid such worst case. > So doing the trimming of the ends before you do even that, allows you to > just do the trivial "let's see if the ends are identical" with a plain > memcmp, which is much faster. Yes, tail trimming done on a block-basis is faster and does not consume memory. The code for libxdiff would have to be a bit more complex though, since memory files can be composed by many sections, of different sizes (so you cannot just assume it's a single block you're trimming the end). Also, you'd need some code at the end that hands you back at least the N lines you want for context. - Davide - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html