Re: [PATCH v2 1/3] commit-reach: add get_branch_base_for_tip

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 8/12/24 4:30 PM, Junio C Hamano wrote:
"Derrick Stolee via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:
Repositories that use pull requests (or merge requests) to advance one or
more "protected" branches, the history of that reference can be recovered by
following the first-parent history in most cases.

I cannot quite parse it, but perhaps "Repositories that" -> "In
repositories that"?

That is an improvement, thanks.

Most are completed using
no-fast-forward merges, though squash merges are quite common. Less common
is rebase-and-merge, which still validates this assumption. Finally, the
case that breaks this assumption is the fast-forward update (with potential
rebasing).  Even in this case, the previous commit commonly appears in the
first-parent history of the branch.

Given current command-line interface options, this optimization criteria is
not easy to detect directly. Even using the command

   git rev-list --count --first-parent <base>..<source>

does not measure this count, as it uses full reachability from <base> to
determine which commits to remove from the range '<base>..<source>'.

Makes me wonder if "--ancestry-path" would help.

One difficulty here is that we don't know the "first-parent merge base"
to supply to the --ancestry-path argument. You could first find this by
running

  git rev-list --first-parent --boundary --reverse A...B

and pulling out the first boundary commit 'C'. Then, that could be used in

 git rev-list  --first-parent --count --ancestry-path=C B

I believe that this two-process-per-ref approach would provide an
existing way to compute these results.

The trickiest part of the integer slab is what happens when reaching a
collision among the histories of the bases and the history of the source.
This is noticed when viewing the first parent and seeing that it has a slab
value that differs in sign (negative or positive). In this case, the
collision commit is stored in the method variable 'branch_point' and its
slab value is set to -1. The index of the best base (so far) is stored in
the method variable 'best_index'. It is possible that there are multiple
commits that have the branch_point as its first parent, leading to multiple
updates of best_index.  The result is determined when 'branch_point' is
visited in the commit walk, giving the guarantee that all commits that could
reach 'branch_point' were visited.

OK.

+/*
+ * This slab initializes integers to zero, so use "-1" for "tip is best" and
+ * "i + 1" for "bases[i] is best".
+ */
+define_commit_slab(best_branch_base, int);
+static struct best_branch_base best_branch_base;
+#define get_best(c) (*best_branch_base_at(&best_branch_base, c))
+#define set_best(c,v) (*best_branch_base_at(&best_branch_base, c) = v)

Micronit.  Prepare for macro arguments to be expressions, even if
current callers don't use anything more complex, i.e., something
like

	(*best_branch_base_at(&best_branch_base, (c)))
	(*best_branch_base_at(&best_branch_base, (c)) = (v))

Thanks. I should have caught this myself.

+
+	/* Initialize queue and slab now that generations are guaranteed. */
+	init_best_branch_base(&best_branch_base);
+	set_best(tip, -1);
+	prio_queue_put(&queue, tip);
+
+	for (size_t i = 0; i < bases_nr; i++) {
+		struct commit *c = bases[i];
+
+		/* Has this already been marked as best by another commit? */
+		if (get_best(c))
+			continue;

Oh, so this defines the tie-breaking behaviour, but simply removing
it is a wrong solution if we wanted our tie-breaking to work as
"last one wins", as we still do not want to put it in the queue, so
this "if best is already found, skip the rest" is serving dual
purposes.  Good.

When trying to make a test case for the for-each-ref behavior around
non-commits, I noticed a bug here. If get_best(c) is -1, then 'c' is
equal to the base and should be selected. I will update the logic here
and add an appropriate test in this patch.

Thanks,
-Stolee





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux