On Tue, Oct 24, 2023 at 7:45 PM Junio C Hamano <gitster@xxxxxxxxx> wrote: > I would suspect that swapping if/else would make it easier to > follow. Everybody else in the patch guards the use of the oidset > with "were we told not to die on missing objects?", i.e., > > if (revs->do_not_die_on_missing_objects) > oidset_insert(&revs->missing_objects, &p->object.oid); > else > return -1; /* corrupt repository */ > Makes sense. Will change. > > @@ -3800,6 +3809,9 @@ int prepare_revision_walk(struct rev_info *revs) > > FOR_EACH_OBJECT_PROMISOR_ONLY); > > } > > > > + if (revs->do_not_die_on_missing_objects) > > + oidset_init(&revs->missing_objects, 0); > > I read the patch to make sure that .missing_objects oidset is used > only when .do_not_die_on_missing_objects is set and the oidset is > untouched unless it is initialized. Well done. > > I know I floated "perhaps oidset can replace the object bits, > especially because the number of objects that need marking is > expected to be small", but I am curious what the performance > implication of this would be. Is this something we can create > comparison easily? I did a simple comparision here, I randomly deleted commits which had child commits with greater than 2 parents. $ git rev-list --missing=print HEAD | grep "?" | wc -l 828 Using the flag bit: $ hyperfine -w 3 "~/git/bin-wrappers/git rev-list --missing=allow-any HEAD" Benchmark 1: ~/git/bin-wrappers/git rev-list --missing=allow-any HEAD Time (mean ± σ): 860.5 ms ± 15.2 ms [User: 375.3 ms, System: 467.5 ms] Range (min … max): 835.2 ms … 881.0 ms 10 runs Using the oidset: $ hyperfine -w 3 "~/git/bin-wrappers/git rev-list --missing=allow-any HEAD" Benchmark 1: ~/git/bin-wrappers/git rev-list --missing=allow-any HEAD Time (mean ± σ): 901.3 ms ± 9.6 ms [User: 394.3 ms, System: 488.3 ms] Range (min … max): 885.0 ms … 913.2 ms 10 runs Its definitely slower, but not by much. > > > I noticed that nobody releases the resource held by this new oidset. > > Shouldn't we do so in revision.c:release_revisions()? > > It seems that linux-leaks CI job noticed the same. > > https://github.com/git/git/actions/runs/6633599458/job/18021612518#step:5:2949 > > I wonder if the following is sufficient? > > revision.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git c/revision.c w/revision.c > index 724a116401..7a67ff74dc 100644 > --- c/revision.c > +++ w/revision.c > @@ -3136,6 +3136,8 @@ void release_revisions(struct rev_info *revs) > clear_decoration(&revs->merge_simplification, free); > clear_decoration(&revs->treesame, free); > line_log_free(revs); > + if (revs->do_not_die_on_missing_objects) > + oidset_clear(&revs->missing_objects); > } > > static void add_child(struct rev_info *revs, struct commit *parent, struct commit *child) > Yup, this seems right and was missed, will add. On Wed, Oct 25, 2023 at 8:40 AM Patrick Steinhardt <ps@xxxxxx> wrote: > > > > > if (commit->object.flags & ADDED) > > return 0; > > + if (revs->do_not_die_on_missing_objects && > > + oidset_contains(&revs->missing_objects, &commit->object.oid)) > > Nit: indentation is off here. > Thanks, will fix. > > + > > + /* Missing objects to be tracked without failing traversal. */ > > + struct oidset missing_objects; > > As far as I can see we only use this set to track missing commits, but > none of the other objects. The name thus feels a bit misleading to me, > as a reader might rightfully assume that it contains _all_ missing > objects after the revwalk. Should we rename it to `missing_commits` to > clarify? > Fair enough, I was thinking of being future compatible. But probably best to be specific now and refactor as needed in the future. Will change. Thanks both for the review!