On Wed, Oct 03, 2018 at 03:16:39PM +0200, René Scharfe wrote: > Performance of a command that mainly checks for duplicate objects using > an oidset, with master and Clang 6.0.1: > > $ cmd="./git-cat-file --batch-all-objects --unordered --buffer --batch-check='%(objectname)'" > > $ /usr/bin/time $cmd >/dev/null > 0.22user 0.03system 0:00.25elapsed 99%CPU (0avgtext+0avgdata 48484maxresident)k > 0inputs+0outputs (0major+11204minor)pagefaults 0swaps > > $ hyperfine "$cmd" > Benchmark #1: ./git-cat-file --batch-all-objects --unordered --buffer --batch-check='%(objectname)' > > Time (mean ± σ): 250.0 ms ± 6.0 ms [User: 225.9 ms, System: 23.6 ms] > > Range (min … max): 242.0 ms … 261.1 ms > > And with this patch: > > $ /usr/bin/time $cmd >/dev/null > 0.14user 0.00system 0:00.15elapsed 100%CPU (0avgtext+0avgdata 41396maxresident)k > 0inputs+0outputs (0major+8318minor)pagefaults 0swaps > > $ hyperfine "$cmd" > Benchmark #1: ./git-cat-file --batch-all-objects --unordered --buffer --batch-check='%(objectname)' > > Time (mean ± σ): 151.9 ms ± 4.9 ms [User: 130.5 ms, System: 21.2 ms] > > Range (min … max): 148.2 ms … 170.4 ms Thanks. Just to reinforce these timings, a similar command on linux.git drops from 4.0s to 2.3s. The patch itself mostly looks good. A few small comments: > diff --git a/fetch-pack.c b/fetch-pack.c > index 75047a4b2a..a839315726 100644 > --- a/fetch-pack.c > +++ b/fetch-pack.c > @@ -536,7 +536,7 @@ static int tip_oids_contain(struct oidset *tip_oids, > * add to "newlist" between calls, the additions will always be for > * oids that are already in the set. > */ > - if (!tip_oids->map.map.tablesize) { > + if (!tip_oids->set.n_buckets) { > add_refs_to_oidset(tip_oids, unmatched); > add_refs_to_oidset(tip_oids, newlist); > } This is a little intimate with the implementation of khash, but I think it's probably OK (and really no worse than what was there before). As the comment above notes, I think we're really looking at the case where this gets populated on the first call, but not subsequent ones. It might be less hacky to use a "static int initialized" here. Or if we want to avoid hidden globals, put the logic into filter_refs() to decide when to populate. I don't think any of that needs to hold up this series, though. > int oidset_insert(struct oidset *set, const struct object_id *oid) > { > - struct oidmap_entry *entry; > - > - if (!set->map.map.tablesize) > - oidmap_init(&set->map, 0); > - else if (oidset_contains(set, oid)) > - return 1; > - > - entry = xmalloc(sizeof(*entry)); > - oidcpy(&entry->oid, oid); > - > - oidmap_put(&set->map, entry); > - return 0; > + int added; > + kh_put_oid(&set->set, *oid, &added); > + return !added; > } This actually does the check-and-insert in a single lookup, which is nice. ;) > diff --git a/oidset.h b/oidset.h > index 40ec5f87fe..4b90540cd4 100644 > --- a/oidset.h > +++ b/oidset.h > [...] > +KHASH_INIT(oid, struct object_id, int, 0, oid_hash, oid_equal) This will declare these "static inline". Our other major "macros become inline functions" code is commit-slab.h, and there we found it necessary to add MAYBE_UNUSED. I wonder if we ought to be doing the same here (I don't get any warnings, but I suspect sparse might complain). It might be nice if these functions could hide inside oidset.c (and just declare the struct here). It looks like we might be able to do that with __KHASH_TYPE(), but the double-underscore implies that we're not supposed to. ;) I guess we also use a few of them in our inlines here. I'm not 100% sure that oidset_* needs to be inlined either, but this is at least a pretty faithful conversion of the original. -Peff