On 12/8/06, Junio C Hamano <junkio@xxxxxxx> wrote:
> > yes, except that it'll compare the whole trees. Could I make it stop
> > at first mismatch? "-q|--quiet" for git-diff-index perhaps?
> > It's just not only stat, but also, open, read, mmap (yes, I try to use
> > it for packs) and close are really slow here as well.
>
> That sounds like optimizing for a wrong case -- you expect the
> index to match HEAD and trying to catch mistakes by detecting a
> mismatch, right?
I expect the index to differ from HEAD. The test is to avoid the mistake of doing an empty commit.
> Having said that, I should point out that it is a low hanging fruit to
> optimize "diff-index --cached" for cases where index is expected to
> mostly match HEAD. The current code for "diff-index --cached" reads the
> whole tree into the index as stage #1 entries
> (diff-lib.c::run_diff_index), and then compares stage #0 (from the
> original index contents) and stage #1 (the tree parameter from the
> command line). Even if you stop at the first mismatch, you would
> already have paid the overhead to open and read all tree objects before
> even starting the comparison.
But I don't have to pay for the overhead of comparing all entries if I can stop at the first mismatch and exit with a non-zero status. I think it'd make a difference (at least some difference). But if we could avoid loading the entries which will never be compared anyway, the speedup would of course be more substantial...
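To make the two costs concrete, here is a toy sketch in Python. It is not git code: the function names, the `counter` dict and the flat lists of (path, sha1) pairs are all made up for illustration, and a real tree is nested rather than flat. It only shows the difference between "read everything, then compare" and "read on demand, stop at the first mismatch".

```python
from itertools import zip_longest

def read_tree(entries, counter):
    """Yield tree entries one at a time, counting how many were actually read."""
    for entry in entries:
        counter["read"] += 1
        yield entry

def differs_eager(index, tree_entries, counter):
    # Roughly the shape of "load the whole tree first, then compare":
    # every tree entry is read before the first comparison happens.
    loaded = list(read_tree(tree_entries, counter))
    return loaded != index

def differs_lazy(index, tree_entries, counter):
    # Read tree entries on demand and return at the first mismatch.
    # zip_longest pads the shorter side with None, so a missing or
    # extra entry also counts as a mismatch.
    for ours, theirs in zip_longest(index, read_tree(tree_entries, counter)):
        if ours != theirs:
            return True
    return False
```

With four entries and a mismatch in the second one, the eager variant reads all four entries and the lazy one reads two. When nothing differs, both read everything, so stopping early only helps when a difference is expected, which is the case here.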
> In 'pu' (jc/diff topic), I have a very generic code to walk the index,
> working tree and zero or more trees in parallel, taking advantage of
> cache-tree. If somebody is interested to learn the internals of git,
> some of the code could be lifted from there and simplified to walk just
> the index and a single tree, and I think that would optimize
> "diff-index --cached" quite a bit.
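As a rough picture of what walking the index against a single tree with the help of cache-tree could look like, here is another toy sketch in Python. It is not the code from the jc/diff topic, and every name and data layout in it is an assumption made for illustration: `store` maps a tree name to its entries, `index` maps a path to a blob name, and `cache_tree` maps a directory prefix to the tree name the index is known to match there (directories with changed entries are simply absent).

```python
def index_differs(tree_sha, prefix, store, index, cache_tree, opened):
    """Return True as soon as the index and the tree differ under prefix."""
    # If cache-tree says the index below this directory already hashes to
    # the same tree, the whole subtree matches and is never opened.
    if cache_tree.get(prefix) == tree_sha:
        return False

    opened.append(prefix)          # this tree object has to be read
    tree = store[tree_sha]

    # Immediate children the index has under this directory.
    in_index = {p[len(prefix):].split("/", 1)[0]
                for p in index if p.startswith(prefix)}
    if in_index != set(tree):
        return True                # something was added or removed here

    for name, (kind, sha) in sorted(tree.items()):
        path = prefix + name
        if kind == "tree":
            if index_differs(sha, path + "/", store, index, cache_tree, opened):
                return True        # stop at the first mismatch
        elif index.get(path) != sha:
            return True
    return False
```

The `opened` list is only there to show which trees had to be read. With a valid cache-tree entry for the root, a fully matching index is answered without opening any tree at all.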
Will try to look at it.