On Sep 26, 2016, at 09:36, Linus Torvalds wrote:
On Mon, Sep 26, 2016 at 5:00 AM, Jeff King <peff@xxxxxxxx> wrote:
This patch teaches get_short_sha1() to list the sha1s of the
objects it found, along with a few bits of information that
may help the user decide which one they meant.
This looks very good to me, but I wonder if it couldn't be even more
aggressive.
In particular, the only hashes that most people ever use in short form
are commit hashes. Those are the ones you'd use in normal human
interactions to point to something happening.
So when the disambiguation notices that there is ambiguity, but there
is only _one_ commit, maybe it should just have an aggressive mode
that says "use that as if it wasn't ambiguous".
If you have this:
faa23ec9b437812ce2fc9a5b3d59418d672debc1 refs/heads/ambig
7f40afe646fa3f8a0f361b6f567d8f7d7a184c10 refs/tags/ambig
and you do this:
$ git rev-parse ambig
warning: refname 'ambig' is ambiguous.
7f40afe646fa3f8a0f361b6f567d8f7d7a184c10
Git automatically prefers the tag over the branch, but it does spit
out a warning.
And then have an explicit command (or flag) to do disambiguation for
when you explicitly want it.
I think you don't even need that. Git already does disambiguation for
ref names, picks one and spits out a warning.
Why not do the same for short hash names when it makes sense?
Rationale: you'd never care about short forms for tags. You'd just use
the tag name. And while blob ID's certainly show up in short form in
diff output (in the "index" line), very few people will use them. And
tree hashes are basically never seen outside of any plumbing commands
and then seldom in shortened form.
So I think it would make sense to default to a mode that just picks
the commit hash if there is only one such hash. Sure, some command
might want a "treeish", but a commit is still more likely than a tree
or a tag.
But regardless, this series looks like a good thing.
I like it too.
But perhaps it makes sense to actually pick one if there's only one
disambiguation of the type you're looking for.
For example given:
235234a blob
2352347 tag
235234f tree
2352340 commit
If you are doing "git cat-file blob 235234" it should pick the blob
and spit out a warning (and similarly for other cat-file types). But
"git cat-file -p 235234" would give the fatal error with the
disambiguation hints because it wants type "any".
If you are doing "git show 235234" it should pick the tag (if it peels
to a committish) because Git has already set a precedent of preferring
tags over commits when it disambiguates ref names and otherwise pick
the commit.
Lets consider this approach using the stats for the Linux kernel:
Ambiguous prefix length 7 counts:
prefixes: 44733
objects: 89766
Ambiguous length 11 (but not at length 12) info:
prefixes: 2
0 (with 1 or more commit disambiguations)
Ambiguous length 10 (but not at length 11) info:
prefixes: 12
3 (with 1 or more commit disambiguations)
0 (with 2 or more commit disambiguations)
Ambiguous length 9 (but not at length 10) info:
prefixes: 186
43 (with 1 or more commit disambiguations)
1 (with 2 or more commit disambiguations)
Ambiguous length 8 (but not at length 9) info:
prefixes: 2723
651 (with 1 or more commit disambiguations)
40 (with 2 or more commit disambiguations)
Ambiguous length 7 (but not at length 8) info:
prefixes: 41864
9842 (with 1 or more commit disambiguations)
680 (with 2 or more commit disambiguations)
Of the 44733 ambiguous length 7 prefixes, only about 10539 of them
disambiguate into one or more commit objects.
But if we apply the "spit a warning and prefer a commit object if
there's only one and you're looking for a committish" rule, that drops
the number from 10539 to about 721. In other words, only about 7% of
the previously ambiguous short commit SHA1 prefixes would continue to
be ambiguous at length 7. In fact it almost makes a prefix length of
9 good enough, there's just the one at length 9 that disambiguates
into more than one commit (45f014c52).
--Kyle