Am 12.10.20 um 22:39 schrieb Barret Rhoden: > Hi - > > On 10/11/20 12:03 PM, René Scharfe wrote: > [snip] >>> Any performance improvement would be welcome. I haven't looked at >>> the code in a while, but I don't recall any reasons why this wouldn't >>> work. >> >> Using a commit flag instead of an oidset would only improve >> performance noticeably if the product of the number of suspects and >> ignored commits was huge, I guess. >> >> I get weird timings for an ignore file containing basically all commits >> (created with "git log --format=%H"). With Git's own repo and rc1: >> >> Benchmark #1: ./git-blame --ignore-revs-file hashes Makefile >> Time (mean ± σ): 8.470 s ± 0.049 s [User: 7.923 s, System: 0.547 s] >> Range (min … max): 8.434 s … 8.605 s 10 runs >> >> And with the patch at the bottom: >> >> Benchmark #1: ./git-blame --ignore-revs-file hashes Makefile >> Time (mean ± σ): 8.048 s ± 0.061 s [User: 7.899 s, System: 0.146 s] >> Range (min … max): 7.987 s … 8.175 s 10 runs >> >> That looks like a nice speedup, but why for system time alone? Malloc >> overhead perhaps? > > Hard to say. Maybe page faults when walking the old ignore_list? brk(2) calls. strace -c says that rc1 has 21657 of them and the patch gets that down to 8132. They dominate system time in both cases. > >> Anyway, here's the patch: > > Looks good to me. > > Barret > > >> --- >> blame.c | 2 +- >> blame.h | 5 +++-- >> builtin/blame.c | 16 ++++++++++++---- >> object.h | 3 ++- >> 4 files changed, 18 insertions(+), 8 deletions(-) >> >> diff --git a/blame.c b/blame.c >> index 686845b2b4..6e8c8fec9b 100644 >> --- a/blame.c >> +++ b/blame.c >> @@ -2487,7 +2487,7 @@ static void pass_blame(struct blame_scoreboard *sb, struct blame_origin *origin, >> /* >> * Pass remaining suspects for ignored commits to their parents. >> */ >> - if (oidset_contains(&sb->ignore_list, &commit->object.oid)) { >> + if (commit->object.flags & BLAME_IGNORE) { >> for (i = 0, sg = first_scapegoat(revs, commit, sb->reverse); >> i < num_sg && sg; >> sg = sg->next, i++) { >> diff --git a/blame.h b/blame.h >> index b6bbee4147..d35167e8bd 100644 >> --- a/blame.h >> +++ b/blame.h >> @@ -16,6 +16,9 @@ >> #define BLAME_DEFAULT_MOVE_SCORE 20 >> #define BLAME_DEFAULT_COPY_SCORE 40 >> >> +/* Remember to update object flag allocation in object.h */ >> +#define BLAME_IGNORE (1u<<14) >> + >> struct fingerprint; >> >> /* >> @@ -125,8 +128,6 @@ struct blame_scoreboard { >> /* linked list of blames */ >> struct blame_entry *ent; >> >> - struct oidset ignore_list; >> - >> /* look-up a line in the final buffer */ >> int num_lines; >> int *lineno; >> diff --git a/builtin/blame.c b/builtin/blame.c >> index bb0f29300e..1c6721b5d5 100644 >> --- a/builtin/blame.c >> +++ b/builtin/blame.c >> @@ -830,21 +830,29 @@ static void build_ignorelist(struct blame_scoreboard *sb, >> { >> struct string_list_item *i; >> struct object_id oid; >> + const struct object_id *o; >> + struct oidset_iter iter; >> + struct oidset ignore_list = OIDSET_INIT; >> >> - oidset_init(&sb->ignore_list, 0); >> for_each_string_list_item(i, ignore_revs_file_list) { >> if (!strcmp(i->string, "")) >> - oidset_clear(&sb->ignore_list); >> + oidset_clear(&ignore_list); >> else >> - oidset_parse_file_carefully(&sb->ignore_list, i->string, >> + oidset_parse_file_carefully(&ignore_list, i->string, >> peel_to_commit_oid, sb); >> } >> for_each_string_list_item(i, ignore_rev_list) { >> if (get_oid_committish(i->string, &oid) || >> peel_to_commit_oid(&oid, sb)) >> die(_("cannot find revision %s to ignore"), i->string); >> - oidset_insert(&sb->ignore_list, &oid); >> + oidset_insert(&ignore_list, &oid); >> } >> + oidset_iter_init(&ignore_list, &iter); >> + while ((o = oidset_iter_next(&iter))) { >> + struct commit *commit = lookup_commit(sb->repo, o); >> + commit->object.flags |= BLAME_IGNORE; >> + } >> + oidset_clear(&ignore_list); Without this cleanup the number of brk(2) calls goes up to 24071 for me, increasing system time beyond the one for rc1. So it seems the improvement comes from allocating a few MB (the oidset) and releasing it again to pre-size the heap and avoid thousands of system calls that would otherwise extend it lazily. The patch below on top of rc1 simulates this. New day, new numbers, this time with less background programs; here are the times for rc1: Benchmark #1: ./git-blame --ignore-revs-file=hashes Makefile Time (mean ± σ): 8.210 s ± 0.020 s [User: 7.647 s, System: 0.558 s] Range (min … max): 8.182 s … 8.258 s 10 runs And here with the patch at the bottom: Benchmark #1: ./git-blame --ignore-revs-file=hashes Makefile Time (mean ± σ): 7.879 s ± 0.023 s [User: 7.827 s, System: 0.052 s] Range (min … max): 7.859 s … 7.936 s 10 runs My conclusion: object flags won last time by cheating -- lookup speed isn't really all that different, what matters is allocation overhead. Extending the heap by just a few MB helps a lot. Which is very likely to be a platform-specific (system-specific even?) win. So let's drop this. But it shows that a better direction for improving performance might be to reduce the number of allocations, e.g. by using a mem_pool. >> } >> >> int cmd_blame(int argc, const char **argv, const char *prefix) >> diff --git a/object.h b/object.h >> index 20b18805f0..6818c9296b 100644 >> --- a/object.h >> +++ b/object.h >> @@ -64,7 +64,8 @@ struct object_array { >> * negotiator/default.c: 2--5 >> * walker.c: 0-2 >> * upload-pack.c: 4 11-----14 16-----19 >> - * builtin/blame.c: 12-13 >> + * blame.c: 14 >> + * builtin/blame.c: 12---14 >> * bisect.c: 16 >> * bundle.c: 16 >> * http-push.c: 11-----14 >> -- >> 2.28.0 >> > --- builtin/blame.c | 1 + 1 file changed, 1 insertion(+) diff --git a/builtin/blame.c b/builtin/blame.c index bb0f29300e..aa6970f452 100644 --- a/builtin/blame.c +++ b/builtin/blame.c @@ -845,6 +845,7 @@ static void build_ignorelist(struct blame_scoreboard *sb, die(_("cannot find revision %s to ignore"), i->string); oidset_insert(&sb->ignore_list, &oid); } + free(xmalloc(10*1000*1000)); } int cmd_blame(int argc, const char **argv, const char *prefix) -- 2.28.0