Re: Minimum git commit abbrev length (Was Re: -tip: origin tree build failure (was: [GIT PULL] ext4 update) for 2.6.37)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Oct 28, 2010 at 10:27 AM, Ted Ts'o <tytso@xxxxxxx> wrote:
> On Thu, Oct 28, 2010 at 07:17:01PM +0200, Ingo Molnar wrote:
>> > Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>> > Yes. Except for the kernel the default git commit abbreviation is
>> > borderline too short. Seven hex-chars can easily alias with a few
>> > more pulls from me: git will not give aliases at the time it gives
>> > a shorthand, but a month or two later the abbreviated commit may
>> > no longer be unique.
>> >
>> > So I suggest using --abbrev=12 or similar.
>>
>> ok. A helper script i use does this:
>>
>>    git log --pretty=format:"%h: %s" $@
>>
>> I have added --abbrev=12. Might make sense to lengthen the %h
>> default in upstream Git as well?
>
> Maybe the right thing to do is add a git config option which allows
> for a configurable minimum git commit abbreviation length?

Yes. The default of 7 (I think) comes from fairly early in git
development, when seven hex digits was a lot (it covers about 250+
million hash values). Back then I thought that 65k revisions was a lot
(it was what we were about to hit in BK), and each revision tends to
be about 5-10 new objects or so, so a million objects was a big
number.

These days, the kernel isn't even the largest git project, and even
the kernel has about 220k revisions (_much_ bigger than the BK tree
ever was) and we are approaching two million objects. At that point,
seven hex digits is still unique for a lot of them, but when we're
talking about just two orders of magnitude difference between number
of objects and the hash size, there _will_ be hash collisions. It's no
longer even close to unrealistic - it happens all the time.

So I suspect we should both increase the default abbrev that was
unrealistically small, _and_ add a way for people to set their own
default per-project in the git config file.

Maybe something like the attached (not necessarily well-thought-out or
well-tested: I also didn't actually change the default, although I
suspect we should up it from 7 to at least 10).

                        Linus
 builtin/describe.c |    2 +-
 cache.h            |    5 +++--
 config.c           |    8 ++++++++
 environment.c      |    1 +
 4 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/builtin/describe.c b/builtin/describe.c
index 43caff2..2d98702 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -20,7 +20,7 @@ static int debug;	/* Display lots of verbose info */
 static int all;	/* Any valid ref can be used */
 static int tags;	/* Allow lightweight tags */
 static int longformat;
-static int abbrev = DEFAULT_ABBREV;
+static int abbrev = 7;	/* NOTE! Not DEFAULT_ABBREV */
 static int max_candidates = 10;
 static int found_names;
 static const char *pattern;
diff --git a/cache.h b/cache.h
index 33decd9..6c28a81 100644
--- a/cache.h
+++ b/cache.h
@@ -540,6 +540,7 @@ extern int trust_executable_bit;
 extern int trust_ctime;
 extern int quote_path_fully;
 extern int has_symlinks;
+extern int minimum_abbrev, default_abbrev;
 extern int ignore_case;
 extern int assume_unchanged;
 extern int prefer_symlink_refs;
@@ -757,8 +758,8 @@ static inline unsigned int hexval(unsigned char c)
 }
 
 /* Convert to/from hex/sha1 representation */
-#define MINIMUM_ABBREV 4
-#define DEFAULT_ABBREV 7
+#define MINIMUM_ABBREV minimum_abbrev
+#define DEFAULT_ABBREV default_abbrev
 
 struct object_context {
 	unsigned char tree[20];
diff --git a/config.c b/config.c
index 4b0a820..474361c 100644
--- a/config.c
+++ b/config.c
@@ -514,6 +514,14 @@ static int git_default_core_config(const char *var, const char *value)
 		return 0;
 	}
 
+	if (!strcmp(var, "core.abbrev")) {
+		int abbrev = git_config_int(var, value);
+		if (abbrev < minimum_abbrev || abbrev > 40)
+			return -1;
+		default_abbrev = abbrev;
+		return 0;
+	}
+
 	if (!strcmp(var, "core.loosecompression")) {
 		int level = git_config_int(var, value);
 		if (level == -1)
diff --git a/environment.c b/environment.c
index de5581f..b98003c 100644
--- a/environment.c
+++ b/environment.c
@@ -15,6 +15,7 @@ int user_ident_explicitly_given;
 int trust_executable_bit = 1;
 int trust_ctime = 1;
 int has_symlinks = 1;
+int minimum_abbrev = 4, default_abbrev = 7;
 int ignore_case;
 int assume_unchanged;
 int prefer_symlink_refs;

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]