Re: [PATCH v4] ref-filter: Add --no-contains option to tag/branch/for-each-ref

Junio C Hamano <gitster@xxxxxxxxx> · Sat, 11 Mar 2017 20:44:47 -0800

Ævar Arnfjörð Bjarmason  <avarab@xxxxxxxxx> writes:

> Change the tag, branch & for-each-ref commands to have a --no-contains
> option in addition to their longstanding --contains options.
>
> The use-case I have for this is to find the last-good rollout tag
> given a known-bad <commit>. Right now, given a hypothetically bad
> commit v2.10.1-3-gcf5c7253e0, you can find which git version to revert
> to with this hacky two-liner:
>
>     (./git tag -l 'v[0-9]*'; ./git tag -l 'v[0-9]*' --contains v2.10.1-3-gcf5c7253e0) \
>         |sort|uniq -c|grep -E '^ *1 '|awk '{print $2}' | tail -n 10
>
> But with the --no-contains option you can now get the exact same
> output with:
>
>     ./git tag -l 'v[0-9]*' --no-contains v2.10.1-3-gcf5c7253e0|sort|tail -n 10

This command line, while it may happen to work, logically does not
make much sense.  Move the pattern to the end, i.e.

	git tag -l --no-contains v2.10.1-3-gcf5c7253e0 'v[0-9]*'

Also if an overlong line in an example disturbs you, do not solve it
by omitting SP around pipe.  If you are trying to make the result
readable, pick a readable solution, e.g.

    git tag -l --no-contains v2.10.1-3-gcf5c7253e0 'v[0-9]*' |
    sort | tail -n 10

Oh, drop ./ from ./git while at it ;-)

> The filtering machinery is generic between the tag, branch &
> for-each-ref commands, so once I'd implemented it for tag it was
> trivial to add support for this to the other two.

Also, we tend not to say "I did this, I do that".

	Because the filtering machinery is generic ..., support it
	for all three consistently.

> I'm adding a --without option to "tag" as an alias for --no-contains
> for consistency with --with and --contains. Since we don't even
> document --with anymore (or test it). The --with option is
> undocumented, and possibly the only user of it is Junio[1]. But it's
> trivial to support, so let's do that.

The sentence that begins "Since we don't" is unfinished.  I think
it can safely be removed without losing any information (the next
sentence says the same thing).

> Where I'm changing existing documentation lines I'm mainly word
> wrapping at 75 columns to be consistent with the existing style.

Reviewers would appreciate it if you refrained from doing that in the
same patch.  A minimum patch that lets the review concentrate on what
got changed (i.e. contents), followed by a mechanical reflow as a
follow-up, or something like that, would be much nicer to handle.

> Most of the test changes I've made are just doing the inverse of the
> existing --contains tests, with this change --no-contains for tag,
> branch & for-each-ref is just as well tested as the existing
> --contains option.

Again, we tend to try to keep our commit messages from being about
"I, my, me".

	Add --no-contains tests for tag, branch and for-each-ref
	that mostly do the inverse of the existing tests we have for
	--contains.

> This is now based on top of pu, which has Jeff King's "fix object flag
> pollution in "tag --contains" series.

Thanks for this note.  I obviously cannot queue on top of 'pu' ;-)
but will fork this topic off of the jk/ref-filter-flags-cleanup
topic.

>  'git for-each-ref' [--count=<count>] [--shell|--perl|--python|--tcl]
>  		   [(--sort=<key>)...] [--format=<format>] [<pattern>...]
>  		   [--points-at <object>] [(--merged | --no-merged) [<object>]]
> -		   [--contains [<object>]]
> +		   [(--contains | --no-contains) [<object>]]

This notation makes sense.  We have to have one of these but
<object> at the end could be omitted (to default to HEAD).  I guess
the same notation can be used in the log for the other "filtering
implies --list mode for 'git tag'" topic.

> +--no-contains [<commit>]::
> +	Only list tags which don't contain the specified commit (HEAD if
> +	not specified).

Just being curious.  Can we do

	for-each-ref --contains --no-contains 

and have both default to HEAD?  I know that would not make sense as
a set operation, but I am curious what our command line parser
(which is oblivious to what the command is doing) does.  I am guessing
that it would barf saying "--contains" needs a commit but "--no-contains"
is not a commit (which is very sensible)?
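
To spell out that guess, here is a toy model.  It is NOT git's
parse-options code; the function name toy_option_value() and the rule
it encodes (an option whose argument may be omitted only when it is
the last word takes the following word as its value otherwise) are
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model, not git's parser.  Assumed rule: the option at argv[i]
 * falls back to its default value only when it is the last word on
 * the command line; otherwise it consumes the next word as its value,
 * even if that word looks like another option.
 */
static const char *toy_option_value(int argc, const char **argv, int i,
				    const char *defval)
{
	assert(argv != NULL && i >= 0 && i < argc);
	if (i == argc - 1)
		return defval;		/* last word: use the default */
	return argv[i + 1];		/* otherwise: take the next word */
}
```

Under that assumption, "--contains --no-contains" would hand the
string "--no-contains" to --contains as its value, which would then
have to be rejected as not naming a commit, while a lone trailing
"--contains" would get the default.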

> +
>  --points-at <object>::
>  	Only list tags of the given object.

This is not a new issue (and certainly not a problem caused by your
patch), but unlike "--contains", this does not default to HEAD when
<object> is not explicitly given?  It seems a bit inconsistent to me.

> @@ -618,7 +620,7 @@ int cmd_branch(int argc, const char **argv, const char *prefix)
>  	if (!delete && !rename && !edit_description && !new_upstream && !unset_upstream && argc == 0)
>  		list = 1;
>  
> -	if (filter.with_commit || filter.merge != REF_FILTER_MERGED_NONE || filter.points_at.nr)
> +	if (filter.with_commit || filter.no_commit || filter.merge != REF_FILTER_MERGED_NONE || filter.points_at.nr)
>  		list = 1;

OK.

> diff --git a/parse-options.h b/parse-options.h
> index dcd8a0926c..0eac90b510 100644
> --- a/parse-options.h
> +++ b/parse-options.h
> @@ -258,7 +258,9 @@ extern int parse_opt_passthru_argv(const struct option *, const char *, int);
>  	  PARSE_OPT_LASTARG_DEFAULT | flag, \
>  	  parse_opt_commits, (intptr_t) "HEAD" \
>  	}
> -#define OPT_CONTAINS(v, h) _OPT_CONTAINS_OR_WITH("contains", v, h, 0)
> +#define OPT_CONTAINS(v, h) _OPT_CONTAINS_OR_WITH("contains", v, h, PARSE_OPT_NONEG)
> +#define OPT_NO_CONTAINS(v, h) _OPT_CONTAINS_OR_WITH("no-contains", v, h, PARSE_OPT_NONEG)
>  #define OPT_WITH(v, h) _OPT_CONTAINS_OR_WITH("with", v, h, PARSE_OPT_HIDDEN)
> +#define OPT_WITHOUT(v, h) _OPT_CONTAINS_OR_WITH("without", v, h, PARSE_OPT_HIDDEN)

Hmph, perhaps WITH/WITHOUT also do not take the "--no-" form and hence
need PARSE_OPT_NONEG?

> @@ -1586,11 +1587,11 @@ static enum contains_result contains_tag_algo(struct commit *candidate,
>  }
>  
>  static int commit_contains(struct ref_filter *filter, struct commit *commit,
> -			   struct contains_cache *cache)
> +			   struct commit_list *list, struct contains_cache *cache)
>  {
>  	if (filter->with_commit_tag_algo)
> -		return contains_tag_algo(commit, filter->with_commit, cache) == CONTAINS_YES;
> -	return is_descendant_of(commit, filter->with_commit);
> +		return contains_tag_algo(commit, list, cache) == CONTAINS_YES;
> +	return is_descendant_of(commit, list);
>  }
>  
>  /*
> @@ -1780,13 +1781,17 @@ static int ref_filter_handler(const char *refname, const struct object_id *oid,
>  	 * obtain the commit using the 'oid' available and discard all
>  	 * non-commits early. The actual filtering is done later.
>  	 */
> -	if (filter->merge_commit || filter->with_commit || filter->verbose) {
> +	if (filter->merge_commit || filter->with_commit || filter->no_commit || filter->verbose) {
>  		commit = lookup_commit_reference_gently(oid->hash, 1);
>  		if (!commit)
>  			return 0;
> -		/* We perform the filtering for the '--contains' option */
> +		/* We perform the filtering for the '--contains' option... */
>  		if (filter->with_commit &&
> -		    !commit_contains(filter, commit, &ref_cbdata->contains_cache))
> +		    !commit_contains(filter, commit, filter->with_commit, &ref_cbdata->contains_cache))
> +			return 0;
> +		/* ...or for the `--no-contains' option */
> +		if (filter->no_commit &&
> +		    commit_contains(filter, commit, filter->no_commit, &ref_cbdata->no_contains_cache))
>  			return 0;
>  	}

When asking "--contains A --contains B", we show refs that contain
_EITHER_ A or B.  Two predicates are ORed together, and I think it
makes sense.

When asking "--contains A --no-contains B", we show refs that
contain A but exclude refs that contain B.  Two predicates are
ANDed together, and I think this also makes sense.

When asking "--no-contains A --no-contains B", what should we show?
This implementation makes the two predicates ANDed together [*1*].

The behaviour is sensible, but is it consistent with the way the
existing --no-merged works?

I think the rule is something like:

    A match with any positive selection criterion (like --contains
    A) makes a ref eligible for output, but then a match with any
    negative selection criterion (like --no-merged) filters it out.

Is it easy to explain to the users?  Do we need doc updates to
clarify, or does the description for existing --no-merged already
cover this?
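
To make the rule concrete, here is a rough sketch of it as a
predicate.  This is a toy model and not the ref-filter code: the names
toy_filter and toy_ref_shown(), and the idea of summarizing "the
commits a ref contains" as a bitmask, are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: one bit per commit of interest (e.g. A = 1, B = 2). */
typedef unsigned int commit_set;

struct toy_filter {
	commit_set contains;	/* union of all --contains arguments */
	commit_set no_contains;	/* union of all --no-contains arguments */
};

/* Returns 1 if a ref containing the commits in 'reachable' is shown. */
static int toy_ref_shown(const struct toy_filter *f, commit_set reachable)
{
	assert(f != NULL);
	/* Positive criteria are ORed: containing any one of them is enough. */
	if (f->contains && !(reachable & f->contains))
		return 0;
	/* Negative criteria: containing any one of them filters the ref out. */
	if (f->no_contains && (reachable & f->no_contains))
		return 0;
	return 1;
}
```

In this model "--no-contains A --no-contains B" shows only refs that
contain neither A nor B, which is the negation of "contains either A
or B".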

Thanks.

[Footnote]

*1* ... because it uses the same commit_contains() machinery that
computes "contains either A or B" used for the first one and then
negates its result.