Re: [PATCH 10/10] get_short_sha1: list ambiguous objects on error

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sep 26, 2016, at 09:36, Linus Torvalds wrote:

On Mon, Sep 26, 2016 at 5:00 AM, Jeff King <peff@xxxxxxxx> wrote:

This patch teaches get_short_sha1() to list the sha1s of the
objects it found, along with a few bits of information that
may help the user decide which one they meant.

This looks very good to me, but I wonder if it couldn't be even more aggressive.

In particular, the only hashes that most people ever use in short form
are commit hashes. Those are the ones you'd use in normal human
interactions to point to something happening.

So when the disambiguation notices that there is ambiguity, but there
is only _one_ commit, maybe it should just have an aggressive mode
that says "use that as if it wasn't ambiguous".

If you have this:

faa23ec9b437812ce2fc9a5b3d59418d672debc1 refs/heads/ambig
7f40afe646fa3f8a0f361b6f567d8f7d7a184c10 refs/tags/ambig

and you do this:

$ git rev-parse ambig
warning: refname 'ambig' is ambiguous.
7f40afe646fa3f8a0f361b6f567d8f7d7a184c10

Git automatically prefers the tag over the branch, but it does spit out a warning.

And then have an explicit command (or flag) to do disambiguation for
when you explicitly want it.

I think you don't even need that. Git already does disambiguation for ref names, picks one and spits out a warning.

Why not do the same for short hash names when it makes sense?

Rationale: you'd never care about short forms for tags. You'd just use
the tag name. And while blob ID's certainly show up in short form in
diff output (in the "index" line), very few people will use them. And
tree hashes are basically never seen outside of any plumbing commands
and then seldom in shortened form.

So I think it would make sense to default to a mode that just picks
the commit hash if there is only one such hash. Sure, some command
might want a "treeish", but a commit is still more likely than a tree
or a tag.

But regardless, this series looks like a good thing.

I like it too.

But perhaps it makes sense to actually pick one if there's only one disambiguation of the type you're looking for.

For example given:

235234a blob
2352347 tag
235234f tree
2352340 commit

If you are doing "git cat-file blob 235234" it should pick the blob and spit out a warning (and similarly for other cat-file types). But "git cat-file -p 235234" would give the fatal error with the disambiguation hints because it wants type "any".

If you are doing "git show 235234" it should pick the tag (if it peels to a committish) because Git has already set a precedent of preferring tags over commits when it disambiguates ref names and otherwise pick the commit.

Lets consider this approach using the stats for the Linux kernel:

Ambiguous prefix length 7 counts:
  prefixes:   44733
   objects:   89766

Ambiguous length 11 (but not at length 12) info:
  prefixes:       2
                  0 (with 1 or more commit disambiguations)

Ambiguous length 10 (but not at length 11) info:
  prefixes:      12
                  3 (with 1 or more commit disambiguations)
                  0 (with 2 or more commit disambiguations)

Ambiguous length 9 (but not at length 10) info:
  prefixes:     186
                 43 (with 1 or more commit disambiguations)
                  1 (with 2 or more commit disambiguations)

Ambiguous length 8 (but not at length 9) info:
  prefixes:    2723
                651 (with 1 or more commit disambiguations)
                 40 (with 2 or more commit disambiguations)

Ambiguous length 7 (but not at length 8) info:
  prefixes:   41864
               9842 (with 1 or more commit disambiguations)
                680 (with 2 or more commit disambiguations)

Of the 44733 ambiguous length 7 prefixes, only about 10539 of them disambiguate into one or more commit objects.

But if we apply the "spit a warning and prefer a commit object if there's only one and you're looking for a committish" rule, that drops the number from 10539 to about 721. In other words, only about 7% of the previously ambiguous short commit SHA1 prefixes would continue to be ambiguous at length 7. In fact it almost makes a prefix length of 9 good enough, there's just the one at length 9 that disambiguates into more than one commit (45f014c52).

--Kyle



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]