On Fri, Apr 18, 2014 at 4:36 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote: > "Luis R. Rodriguez" <mcgrof@xxxxxxxxxxxxxxxx> writes: > >> I think ultimately this reveals that given that tags *can* be >> arbitrary and subjective,... > > Yes; see the part at the bottom. > >>> Commit A can be described in terms of both v3.4 and v9.0, >> >> And in the real example case, why *would* c5905afb' be be described in >> terms of v3.5 instead of v3.4 ? > > I am not interested in graphing that particular history between v3.4 > and v3.5 myself. If you are interested, I already gave you enough > information on how to figure that out. I was alluding to another possible issue here, my concern was that the commit's parent (which is not really the point at which it was merged, but rather where the topic got forked off to be worked on) could be used for as reference points but clearly its not given the nature of how name-rev was implemented. I still do see some possible issues with it's parent on other commands (but I haven't studied the other's implementation) that reveals some of my original concerns, but its unclear if they are related. I also found that if we didn't want to rely on dates or start defining naming convention we may want to reconsider the name_rev() recursive implementation. I'll illustrate a few results that might help to show my concerns for both other commands perhaps using the parent erroneously, and a possible alternative implementation for name_rev() or at the very least contains. [0] mcgrof@ergon ~/linux (git::master)$ git log c5905afb..v3.5| grep ^commit | wc -l 24878 [1] mcgrof@ergon ~/linux (git::master)$ git log c5905afb..v3.4| grep ^commit | wc -l 13106 [2] mcgrof@ergon ~/linux (git::master)$ git log c5905afb..v3.3| grep ^commit | wc -l 1360 Now that I revised name_rev.c I see the recursive nature of name_rev() works top down from each tag down to each v* tag object and for each actual commit pegs a name on it. How we rule out each tag under this implementation is not that obvious to me, specially when results like [0] and [1] reveal v3.4 should be 'shorter' in light of number of commits. I see now how we don't update a commit's name if other crucial information such as the ones discussed on this thread might be important for the user, and I can see how this can help but an alternative approach, which is what I expected to see implemented at least for 'git describe --contains', would have been to see how many commits are present from the commit's *merged* upstream parent (not the actual parent as in c5905afb's commit case its v3.3 which is not where it got merged). Getting the smallest number of commits under this logic and stopping when we don't find any commits should yield us the base tag under which the commit was merged, without any heuristics on dates. This however applies to Linux though given that we don't merge commits on stable branches but rather create new commits and reference the upstream sha1sum, a practice which also solves the problem Jeff pointed out. The results for command [2] above however a bit surprising, I'd take a look but I should go back to look at other stuff, figured I'd at least bring it up now as it seems relevant. >>> - find candidate tags that can be used to "describe --contains" >>> the commit A, yielding v3.4, v3.5 (not shown), and v9.0; >> >>> - among the candidate tags, cull the ones that contain another >>> candidate tag, rejecting v3.5 (not shown) and v9.0; >> >>> - among the surviving tags, pick the closest. >>> >>> Hmm? >> >> Sounds good to me! > > Not so fast ;-) > > My other message to Peff in response to his another example has an > updated position on this. "Reject candidates that can reach other > candidates" is universally correct, but after that point, there are > at least three but probably more options that suit preference of > different people and project to break ties: > > - Your case that started this thread may want to favor v3.4 if only > because that v3.4 _sounds_ smaller than v4.0 (in Peff's example), > even when v3.4 and v4.0 do not have ancestry relationship. > > - The "closest" we have had is a heuristic to produce a result that > is textually shorter. > > - And as I alluded to, "which one has the earliest timestamp?", is > another valid question to ask. The first one above can be subjective if and only if the Linux upstream model of dealing with stable branches is not followed. In other words I think its a non issue if you create new commits on the stable branches instead of merge stuff onto them. This however is technical practice and I guess not everyone follows. > And there may be more to appear. A new command line option (and > possibly a new configuration) to choose from these three (and more > heuristics that will be added later) would be necessary. Yeah this is rather complex, the resolutions to the issue in the ways you've described seem reasonable to me but do wonder if this can be simplified by reevaluating how the candidates are considered. You'd know better :) Luis -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html