Re: [PATCH v3 09/10] builtin/diff-tree: learn --merge-base

Denton Liu <liu.denton@xxxxxxxxx> · Mon, 21 Sep 2020 14:54:09 -0700

Hi Junio,

On Mon, Sep 21, 2020 at 02:09:24PM -0700, Junio C Hamano wrote:
> Denton Liu <liu.denton@xxxxxxxxx> writes:
> 
> > @@ -165,7 +175,12 @@ int cmd_diff_tree(int argc, const char **argv, const char *prefix)
> >  	case 2:
> >  		tree1 = opt->pending.objects[0].item;
> >  		tree2 = opt->pending.objects[1].item;
> > -		if (tree2->flags & UNINTERESTING) {
> > +		if (merge_base) {
> > +			struct object_id oid;
> > +
> > +			diff_get_merge_base(opt, &oid);
> > +			tree1 = lookup_object(the_repository, &oid);
> > +		} else if (tree2->flags & UNINTERESTING) {
> >  			SWAP(tree2, tree1);
> >  		}
> >  		diff_tree_oid(&tree1->oid, &tree2->oid, "", &opt->diffopt);
> 
> OK.  Handling this in that "case 2" does make sense.
> 
> However.
> 
> The above code as-is will allow something like
> 
>     git diff --merge-base A..B
> 
> and it will be taken the same as
> 
>     git diff --merge-base A B

This does not happen because at the top of diff_get_merge_base(), we
have

	for (i = 0; i < revs->pending.nr; i++) {
		struct object *obj = revs->pending.objects[i].item;
		if (obj->flags)
			die(_("--merge-base does not work with ranges"));
		if (obj->type != OBJ_COMMIT)
			die(_("--merge-base only works with commits"));
	}

which ensures that we don't accept any ranges at all. This is why I
considered the SWAP and merge_base cases to be mutually exclusive.

> Another possibility is to error out when "--merge-base A..B" is
> given, which might be simpler.  Then the code would look more like
> 
> 
> 	tree1 = ...
> 	tree2 = ...
> 
> 	if (merge_base) {
> 		if ((tree1->flags | tree2->flags) & UNINTERESTING)
> 			die(_("use of --merge-base with A..B forbidden"));
> 		... get merge base and assign it to tree1 ...
> 	} else if (tree2->flags & UNINTERESTING) {
> 		SWAP();
> 	}

This is the route I picked, although the logic for this is in
diff_get_merge_base().

> While we are at it, what happens when "--merge-base A...B" is given?
> 
> In the original code without "--merge-base", "git diff-tree A...B"
> places the merge base between A and B in pending.objects[0] and B in
> pending.objects[1], I think.  "git diff-tree --merge-base A...B"
> would further compute the merge base between these two objects, but
> luckily $(git merge-base $(merge-base A B) B) is the same as $(git
> merge-base A B), so you won't get an incorrect answer from such a
> request.  Is this something we want to diagnose as an error?  I am
> inclined to say we should allow it (and if it hurts the user can
> stop doing so) as there is no harm done.

I think that we should error out for all ranges because this option
semantically only really makes sense on two endpoints, not a range of
commits. Since the check is cheap to protect users from themselves, we
might as well actually do it.

Worst case, if someone has a legimitate use case for --merge-base and
ranges, we can allow it in the future, which would be easier than
removing this feature.

Thanks,
Denton