On 3/6/2023 8:05 PM, Taylor Blau wrote: > On Mon, Mar 06, 2023 at 02:06:37PM +0000, Derrick Stolee via GitGitGadget wrote: >> Signed-off-by: Derrick Stolee <derrickstolee@xxxxxxxxxx> > > Having read and worked with this code before, I don't have a ton of > substance to add here. But it was interesting to reread, and I left a > few sprinklings here and there of some thing that we may want to > consider for v2. > > Before that, though, IIRC we wrote most of this together, so I would be > happy to have my: > > Co-authored-by: Taylor Blau <me@xxxxxxxxxxxx> > Signed-off-by: Taylor Blau <me@xxxxxxxxxxxx> > > above your S-o-b here. But you've done so much work since we originally > wrote this together that I don't mind being dropped here. Up to you :-). Sounds good. Sorry for forgetting that collaboration. >> +static void insert_no_dup(struct prio_queue *queue, struct commit *c) >> +{ >> + if (c->object.flags & PARENT2) >> + return; >> + prio_queue_put(queue, c); >> + c->object.flags |= PARENT2; >> +} > > You mentioned this in the patch message, but: > > It may be worth noting here (or in the call to repo_clear_commit_marks() > below) that the PARENT2 flag is used to detect and avoid duplicates in > this list. I'll add a comment before we clear the bits, to be clear about why we need each bit. >> + while (queue_has_nonstale(&queue)) { >> + struct commit *c = prio_queue_get(&queue); >> + struct commit_list *p; >> + struct bitmap *bitmap_c = init_bit_array(c, width); >> + >> + for (i = 0; i < counts_nr; i++) { >> + int reach_from_tip = bitmap_get(bitmap_c, counts[i].tip_index); >> + int reach_from_base = bitmap_get(bitmap_c, counts[i].base_index); > > Since we're XORing these, I'd hate to get bit by bitmap_get returning > something other than 0 or 1. It doesn't, since the return value (for any > "pos" for which it holds that `EWAH_BLOCKI(pos) < self->word_alloc`) is: > I wonder if we might do a little of belt-and-suspenders here by calling > these like: > > int reach_from_tip = !!(bitmap_get(bitmap_c, counts[i].tip_index)); > int reach_from_base = !!(bitmap_get(bitmap_c, counts[i].base_index)); > > where the "!!(...)" is new. Can't hurt. >> + if (bitmap_popcount(bitmap_p) == commits_nr) >> + p->item->object.flags |= STALE; >> + >> + insert_no_dup(&queue, p->item); > > Do we care about inserting p->item when the above condition is met? IOW, > would it be OK to instead write: > > if (bitmap_popcount(bitmap_p) == commits_nr) > p->item->object.flags |= STALE; > else > insert_no_dup(&queue, p->item); We need to push p->item to the queue, even if stale, because it may need to be walked in order to pass the bitmaps (and the STALE bit) to other commits that were reached by only a subset of the tips. Here's an example: A B |/ \ C D | / E / |/ F If A and B are the starting commits, then C is stale when we walk it, so its parent E would be stale. Your proposed change would not add it to the queue, and thus F would never see that it is stale and would be counted as reachable from B but not A. >> diff --git a/commit-reach.h b/commit-reach.h >> index 148b56fea50..1780f9317bf 100644 >> --- a/commit-reach.h >> +++ b/commit-reach.h >> @@ -104,4 +104,34 @@ struct commit_list *get_reachable_subset(struct commit **from, int nr_from, >> struct commit **to, int nr_to, >> unsigned int reachable_flag); >> >> +struct ahead_behind_count { >> + /** >> + * As input, the *_index members indicate which positions in >> + * the 'tips' array correspond to the tip and base of this >> + * comparison. >> + */ >> + size_t tip_index; >> + size_t base_index; >> + >> + /** >> + * These values store the computed counts for each side of the >> + * symmetric difference: >> + * >> + * 'ahead' stores the number of commits reachable from the tip >> + * and not reachable from the base. >> + * >> + * 'behind' stores the number of commits reachable from the base >> + * and not reachable from the tip. >> + */ >> + int ahead; >> + int behind; >> +}; > > Should these be unsigned values? I don't think we have a sensible > interpretation for what a negative "ahead" or "behind" could would mean. > I guess behind "behind" by "N" means you're "ahead" by "-N", but I don't > think it's practical ;-). Unsigned sounds good. >> + >> +/** > > Here and elsewhere, these kind of doc-comments are a little > non-standard, and IIRC the opening should instead be "/*" (with one > asterisk instead of two). I think double-asterisk is the preferred choice for new things, but commit-reach.h only uses a single asterisk so I'll change this to be consistent. Thanks, -Stolee