Re: [PATCH] tag: add -i and --introduced modifier for --contains

"Luis R. Rodriguez" <mcgrof@xxxxxxxxxxxxxxxx> · Fri, 18 Apr 2014 16:17:38 -0700

On Thu, Apr 17, 2014 at 10:04 AM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> "Luis R. Rodriguez" <mcgrof@xxxxxxxxxxxxxxxx> writes:
>
>>> And between v3.4 and v3.5-rc1, the latter is a closer anchor point
>>> for that commit (v3.5-rc1 only needs about 200 hops to reach the
>>> commit, while from v3.4 you would need close to 500 hops),
>>
>> Ah! Thanks for explaining this mysterious puzzle to me. I'm a bit
>> perplexed why still. Can I trouble you for a little elaboration here?

< Junio gives a great huge example>

Phew! Thanks for the elaborate explanation, this makes perfect sense now!

> Now, as to what *SHOULD* happen, I think the above exercise shows us
> a way to define what the desired semantics is, without resorting to
> heuristics (e.g. "which tag has older timestamp?" or "which tag's
> name sorts older under Linux version naming convention?").

I think ultimately this reveals that, given that tags *can* be
arbitrary and subjective, and given that clocks can also be pretty
much arbitrary, 'git describe --contains' can and probably only
should do best effort (TM). Perhaps one thing that would help is
documenting this issue well and providing a set of best practices
that are supported for tagging schemes. I can't describe how many
libraries I've reviewed for software versioning schemes; most of
them support a huge array of things and, funny enough, the Linux
versioning scheme was not supported well for something as simple as
a version sort. This is ultimately why I had to implement my own
sort solution in rel-html. If we agree on this we could, for
example, take on the Linux versioning scheme as an enum and document
that well both in code and on a wiki. More on this below.
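
To make the version-sort point concrete, here is a minimal Python
sketch of a sort key for Linux-style tags. It is not the rel-html
code: the name linux_version_key and the assumed tag pattern
(v<numbers>[-rc<N>]) are made up for illustration only.

```python
import re

# Assumed tag shape (illustrative): v<major>.<minor>[.<more>...][-rc<N>]
_TAG_RE = re.compile(r"^v(\d+(?:\.\d+)*)(?:-rc(\d+))?$")


def linux_version_key(tag):
    """Return a tuple that sorts Linux-style tags in release order."""
    m = _TAG_RE.match(tag)
    if m is None:
        raise ValueError(f"not a Linux-style tag: {tag!r}")
    # Compare numeric components as integers, so v3.10 sorts after v3.9.
    numbers = tuple(int(part) for part in m.group(1).split("."))
    # An -rcN tag precedes the final release with the same numbers,
    # so release candidates get flag 0 and final releases get flag 1.
    if m.group(2) is None:
        stage = (1, 0)
    else:
        stage = (0, int(m.group(2)))
    return (numbers, stage)


EXAMPLE_TAGS = ["v3.10", "v3.4", "v3.5-rc1", "v3.4-rc7", "v3.4.1", "v3.9"]
SORTED_TAGS = sorted(EXAMPLE_TAGS, key=linux_version_key)
```

With this key, v3.4-rc7 sorts before v3.4, and v3.9 before v3.10,
which a plain string sort gets wrong.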

With regards to timestamps: care must be taken given that we'd be
assuming that clocks are synchronized. This can likely yield incorrect
results in a distributed development environment with different time
zones, and it can also be easily cheated, which is why I was concerned
about using timestamps. It's still certainly something that can be
considered, but I've heard enough rants from a few maintainers about
crazy dates on patches to make me believe this could actually be
an issue, especially if we speed up development and need a higher
degree of resolution.

I follow the above example, but it's perhaps worth mentioning that
Linux does not follow that development model for merging stable fixes
or changes. It does not, however, prevent folks from branching off of
older tags to do development which Linux will then pull. Ingo's
case then points, I think, to another mild issue -- the
commit was developed on a v3.3 based tag, which is why 'git describe
--first-parent c5905afb' yields v3.3-rc1-41-gc5905af and not v3.4,
which *can also* be a bit perplexing if one does not understand that
the example you provided can be used as a development work flow for
code sent out to Linus. That said, since we don't follow the
model you laid out it still reveals another issue, and I am not yet
sure I understand why --contains yields a v3.5 tag in that case,
since we ensured commits on v3.5 were either already piled up on older
releases or were being newly introduced in its own release. It smells
to me like the commit's first parent (which can be anything) is
somehow used here as a shortcut?

This doesn't mean we can't use the work flow above for merging changes
from say a v3.4.x onto a v3.5 -- but we don't -- and perhaps as part
of the documentation about a scheme for Linux, we should advise
against such practices. In any case the closest thing I see we can use
upstream on Linux is 'git cherry-pick -x <commit-id>' but Greg doesn't
seem to use this and instead appends the commit with the respective
commit ID of the upstream gitsum. Both strategies yield different
commit IDs anyway, so neither practice should interrupt the 'git
describe --contains' practice. In the stable branches, to find out when
a commit was introduced one would not rely on the commit ID on the
stable branch but instead on the commit ID of the 'upstream
reference'.

> Commit A can be described in terms of both v3.4 and v9.0,

And in the real example case, why *would* c5905afb be described in
terms of v3.5 instead of v3.4?

> and it may
> be closer to v9.0 than v3.4, and under that definition "we pick the
> closest tag", the current "describe --contains" behaviour may be
> correct, but from the human point of view, it is *WRONG*.

Yeap, if a development work flow does not follow a strict pattern
(maybe a .git/config variable?) perhaps 'git describe --contains'
should spit out the few tags it does have?

> It is wrong because v9.0 can reach v3.4.  So perhaps the rule should
> be updated to do something like:
>
>     - find candidate tags that can be used to "describe --contains"
>       the commit A, yielding v3.4, v3.5 (not shown), and v9.0;

Sure.

>
>     - among the candidate tags, cull the ones that contain another
>       candidate tag, rejecting v3.5 (not shown) and v9.0;

Sounds good to me, but that seems to tie the output to a scheme, i.e.,
would it support schemes without a v prefix for tags? In other words,
perhaps do this only for the Linux scheme?

>     - among the surviving tags, pick the closest.
>
> Hmm?

Sounds good to me!
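
The three steps quoted above can be sketched on a toy commit graph.
This is a Python illustration, not git's implementation: the graph,
the commit names and the helper names are all hypothetical, and
"closest" is assumed to mean the fewest parent hops.

```python
from collections import deque


def distances_from(graph, start):
    """BFS over parent links: hops from `start` to each commit it reaches."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        commit = queue.popleft()
        for parent in graph.get(commit, ()):
            if parent not in dist:
                dist[parent] = dist[commit] + 1
                queue.append(parent)
    return dist


def describe_contains(graph, tags, target):
    """Pick a tag for `target` using the candidate / cull / closest rule."""
    reach = {name: distances_from(graph, tip) for name, tip in tags.items()}
    # Step 1: candidate tags are those that can reach the target commit.
    candidates = {name for name, dist in reach.items() if target in dist}
    # Step 2: cull every candidate that contains another candidate tag.
    # Tags on the same commit do not cull each other.
    survivors = {
        name
        for name in candidates
        if not any(
            tags[other] != tags[name] and tags[other] in reach[name]
            for other in candidates
        )
    }
    # Step 3: among the survivors, pick the closest (ties broken by name).
    return min(survivors, key=lambda n: (reach[n][target], n), default=None)


# Toy history (commit -> parents). t90 is a merge whose second parent
# is A, so v9.0 is only one hop from A while v3.4 is three hops away.
GRAPH = {
    "base": [],
    "A": ["base"],
    "x1": ["A"],
    "x2": ["x1"],
    "t34": ["x2"],
    "t35": ["t34"],
    "t90": ["t35", "A"],
}
TAGS = {"v3.3": "base", "v3.4": "t34", "v3.5": "t35", "v9.0": "t90"}
```

Here describe_contains(GRAPH, TAGS, "A") returns "v3.4" even though
v9.0 is nearer, because v3.5 and v9.0 both contain v3.4 and are
culled. In this sketch the cull step looks only at reachability and
never at tag names, so it does not depend on a v prefix.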

  Luis