On Tue, 18 Dec 2007, Jan Hudec wrote:
> On Tue, Dec 11, 2007 at 11:50:08AM -0800, Linus Torvalds wrote:
> > And, btw: the diff is totally different from the xdelta we have, so even
> > if we have an already prepared nice xdelta between the two versions, we'll
> > end up re-generating the files in full, and then do a diff on the end
> > result.
>
> The problem is whether git does not end-up re-generating the same file
> multiple times. When it needs to construct the diff between two versions of
> a file and one is delta-base (even indirect) of the other, does it know to
> create the first, remember it, continue to the other and calculate the diff?

Yes.

Actually, it doesn't "know" anything at all - what happens is that git
internally has a simple "delta-cache", which just caches the latest objects
we've generated from deltas, and which automatically handles this common
case (and others).

So when we tend to work with multiple versions of the same file (which is
obviously very common with diff, and even more so with something like
"annotate"), those multiple versions will obviously also tend to be deltas
against each other and/or against some shared base object, and when we see
a delta, we'll look the base object up in the delta cache, and if it has
been generated earlier we'll be able to short-circuit the whole delta chain
and just use the whole object we already cached.

So if you compare two objects that each have a very deep delta chain, you
will obviously have to walk the whole delta chain _once_ (to generate
whichever version of the file you happen to look up first), but you won't
need to do it twice, because the second time you'll end up hitting in the
delta cache.

		Linus
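
The "walk the chain once, then hit the cache" behaviour described above can be sketched roughly as follows. This is a toy model in Python, not git's actual code (which is C): the `ToyStore` class, its method names, the append-only "delta" format and the `delta_applications` counter are all assumptions made up for illustration. It also caches every object it reconstructs, which may differ from what git really keeps.

```python
from collections import OrderedDict


class ToyStore:
    """Toy object store: an entry is either full text or a delta on a base."""

    def __init__(self, cache_size=8):
        self.objects = {}            # name -> ("full", text) or ("delta", base, suffix)
        self.cache = OrderedDict()   # name -> reconstructed text, most recent last
        self.cache_size = cache_size
        self.delta_applications = 0  # how many times a delta was actually applied

    def add_full(self, name, text):
        self.objects[name] = ("full", text)

    def add_delta(self, name, base, suffix):
        # Hypothetical delta format: "content of base, plus this suffix".
        self.objects[name] = ("delta", base, suffix)

    def read(self, name):
        # Cache hit: short-circuit the whole delta chain below this object.
        if name in self.cache:
            self.cache.move_to_end(name)
            return self.cache[name]

        entry = self.objects[name]
        if entry[0] == "full":
            return entry[1]

        # Cache miss: rebuild the base first (recursively), then apply the delta.
        _, base, suffix = entry
        text = self.read(base) + suffix
        self.delta_applications += 1

        # Remember the result and evict the least recently used entries.
        self.cache[name] = text
        while len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)
        return text


# Build a chain v0 <- v1 <- ... <- v5, where only v0 is stored in full.
store = ToyStore()
store.add_full("v0", "line 0\n")
for i in range(1, 6):
    store.add_delta(f"v{i}", f"v{i - 1}", f"line {i}\n")

new = store.read("v5")   # walks the whole chain once
after_first = store.delta_applications
old = store.read("v4")   # already reconstructed on the way to v5
after_second = store.delta_applications
print(after_first, after_second)
```

In this sketch the second lookup applies no further deltas, because `v4` was reconstructed and cached while `v5` was being built. Nothing in `read` is specific to diff; the cache simply makes the common "several versions of one file" access pattern cheap as a side effect.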