Re: [PATCH v12 5/5] read-cache: speed up has_dir_name (part 2)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 08.07.20 um 15:49 schrieb Jeff Hostetler:
>
>
> On 7/6/20 2:39 AM, Junio C Hamano wrote:
>> SZEDER Gábor <szeder.dev@xxxxxxxxx> writes:
>>
>>>> +                 * last: xxx/yy-file (because '-' sorts before '/')
>>>> +                 * this: xxx/yy/abc
>>>
>>> This is problematic, because the index can already contain 'xxx/yy' as
>>> a file, when adding 'xxx/yy/abc', but since 'xxx/yy' as a file sorts
>>> before 'xxx/yy-file', the short-circuiting here doesn't see it and
>>> thus leaves the d-f collision undetected.  Consequently, even Git
>>> porcelain commands can create tree objects with duplicate entries, as
>>> demonstrated in the tests below.
>>
>> Yeah, the "optimization" is quite bogus.  Thanks for catching it.
>>
>
> yes, thanks!

OK, so now we just need to fix it.  Like this perhaps?

-- >8 --
>From f4b2c70a34bb612e2ccc23a31e2ba8e01dedc373 Mon Sep 17 00:00:00 2001
Subject: [PATCH] read-cache: remove bogus shortcut

has_dir_name() has some optimizations for the case where entries are
added to an index in the correct order.  They kick in if the new entry
sorts after the last one.  One of them exits early if the last entry has
a longer name than the directory of the new entry.  Here's its comment:

/*
 * The directory prefix lines up with part of
 * a longer file or directory name, but sorts
 * after it, so this sub-directory cannot
 * collide with a file.
 *
 * last: xxx/yy-file (because '-' sorts before '/')
 * this: xxx/yy/abc
 */

However, a file named xxx/yy would be sorted before xxx/yy-file because
'-' sorts after NUL, so the length check against the last entry is not
sufficient to rule out a collision.  Remove it.

Reported-by: SZEDER Gábor <szeder.dev@xxxxxxxxx>
Suggested-by: SZEDER Gábor <szeder.dev@xxxxxxxxx>
Signed-off-by: René Scharfe <l.s.r@xxxxxx>
---
 read-cache.c | 14 --------------
 1 file changed, 14 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index aa427c5c17..8ed1c29b54 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1171,20 +1171,6 @@ static int has_dir_name(struct index_state *istate,
 				return retval;
 			}

-			if (istate->cache_nr > 0 &&
-				ce_namelen(istate->cache[istate->cache_nr - 1]) > len) {
-				/*
-				 * The directory prefix lines up with part of
-				 * a longer file or directory name, but sorts
-				 * after it, so this sub-directory cannot
-				 * collide with a file.
-				 *
-				 * last: xxx/yy-file (because '-' sorts before '/')
-				 * this: xxx/yy/abc
-				 */
-				return retval;
-			}
-
 			/*
 			 * This is a possible collision. Fall through and
 			 * let the regular search code handle it.
--
2.27.0




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux