Davide? I'm quoting the whole report, because I suspect you don't follow the git lists, and this is all original libxdiff code. Jim: the annotation failure _may_ just be due to a "valid" diff change (there is not always a unique correct answer for a diff, and so two different diff algorithms can validly give two different answers, which will also mean that git-annotate/blame would give different explanations). But it could certainly also be that you just broke the diffs entirely, so I would like to wait for Davide to comment on your diff before Junio should apply it. Others may be intimately familiar with the diff algorithms too, of course. It just scares me personally ;) Linus On Mon, 16 Oct 2006, Jim Meyering wrote: > > [using very latest code, built an hour ago: > git version 1.4.3.rc3.gb32db-dirty ] > > I found that git-diff-tree is *very* slow when processing files with > many changes. The offending example involves comparing the configure > file from coreutils-6.3 with that from the latest coreutils development > sources. Both are over 50k lines long, and the diff -u output is almost > 50k-lines long, divided into ~700 hunks. > > http://meyering.net/code/git-perf/configure.gz > http://meyering.net/code/git-perf/configure-curr.gz > > Comparing them with "diff -u" takes about 0.3s. > Putting them in a git repo (uncompressed, and with the same name, > of course) and comparing with git-diff-tree takes over a minute. > That's 200 times slower. > > I traced the problem to xdiff/xprepare.c's xdl_cleanup_records function. > Each of its two doubly-nested loops ends up running the inner-loop > code more than 2 *billion* times. > > That seems to be due to the two typos fixed by this patch: > With this patch, my "git-diff-tree --no-commit-id -r -p 2c2172" > command completes in just 2 seconds. > > diff --git a/xdiff/xprepare.c b/xdiff/xprepare.c > index 1be7b31..e5438a9 100644 > --- a/xdiff/xprepare.c > +++ b/xdiff/xprepare.c > @@ -381,7 +381,7 @@ static int xdl_cleanup_records(xdfile_t > hav = (*recs)->ha; > rhi = (long) XDL_HASHLONG(hav, xdf2->hbits); > for (nm = 0, rec = xdf2->rhash[rhi]; rec; rec = rec->next) > - if (rec->ha == hav && ++nm == mlim) > + if (rec->ha == hav || ++nm == mlim) > break; > dis1[i] = (nm == 0) ? 0: (nm >= mlim) ? 2: 1; > } > @@ -392,7 +392,7 @@ static int xdl_cleanup_records(xdfile_t > hav = (*recs)->ha; > rhi = (long) XDL_HASHLONG(hav, xdf1->hbits); > for (nm = 0, rec = xdf1->rhash[rhi]; rec; rec = rec->next) > - if (rec->ha == hav && ++nm == mlim) > + if (rec->ha == hav || ++nm == mlim) > break; > dis2[i] = (nm == 0) ? 0: (nm >= mlim) ? 2: 1; > } > > However, that change causes the t800*.sh (14,16,18) annotate/blame > tests to fail, so take it with a grain of salt. > > I.e., running one of them manually gave this: > > $ sh t8001-annotate.sh --immediate --verbose > * expecting success: check_count A 2 B 1 B1 2 B2 1 "A U Thor" 1 > Author A (expected 2, attributed 2) good > Author B1 (expected 2, attributed 1) bad > Author A U Thor (expected 1, attributed 2) bad > Author B2 (expected 1, attributed 1) good > Author B (expected 1, attributed 1) good > * FAIL 14: Two lines blamed on A, one on B, two on B1, one on B2, one on A U Thor > check_count A 2 B 1 B1 2 B2 1 "A U Thor" 1 > [Exit 1] > - > To unsubscribe from this list: send the line "unsubscribe git" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html > - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html