Re: [PATCH v3 4/8] unpack-trees: fix nested sparse-dir search

René Scharfe <l.s.r@xxxxxx> · Fri, 20 Aug 2021 21:35:53 +0200

Am 20.08.21 um 17:18 schrieb Derrick Stolee:
> On 8/19/2021 4:01 AM, Johannes Schindelin wrote:
>> Hi Stolee,
>>
>> On Tue, 17 Aug 2021, Derrick Stolee via GitGitGadget wrote:
>>
>>> From: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
>>>
>>> The iterated search in find_cache_entry() was recently modified to
>>> include a loop that searches backwards for a sparse directory entry that
>>> matches the given traverse_info and name_entry. However, the string
>>> comparison failed to actually concatenate those two strings, so this
>>> failed to find a sparse directory when it was not a top-level directory.
>>>
>>> This caused some errors in rare cases where a 'git checkout' spanned a
>>> diff that modified files within the sparse directory entry, but we could
>>> not correctly find the entry.
>>
>> Good explanation.
>>
>> I wonder a bit about the performance impact. How "hot" is this function?
>> I.e. how often is it called, on average?
>>
>> I ask because I see opportunities to optimize in both directions: it could
>> be written more concisely (if speed does not matter as much), and it could
>> be made faster (if speed matters a lot). See below for more.
>
> I would definitely optimize for speed here. This can be a very hot path,
> I believe.
>
>>> +	strbuf_addstr(&full_path, info->traverse_path);
>>> +	strbuf_add(&full_path, p->path, p->pathlen);
>>> +	strbuf_addch(&full_path, '/');
>>
>> This could be reduced to:
>>
>> 	strbuf_addf(&full_path, "%s%.*s/",
>> 		    info->traverse_path, (int)p->pathlen, p->path);
>
> We should definitely avoid formatted strings here, if possible.
>
>> But if speed matters, we probably need something more like this:
>>
>> 	size_t full_path_len;
>> 	const char *full_path;
>> 	char *full_path_1 = NULL;
>>
>> 	if (!*info->traverse_path) {
>> 		full_path = p->path;
>> 		full_path_len = p->pathlen;
>> 	} else {
>> 		size_t len = strlen(info->traverse_path);
>>
>> 		full_path_len = len + p->pathlen + 1;
>> 		full_path = full_path_1 = xmalloc(full_path_len + 1);
>> 		memcpy(full_path_1, info->traverse_path, len);
>> 		memcpy(full_path_1 + len, p->path, p->pathlen);
>> 		full_path_1[full_path_len - 1] = '/';
>> 		full_path_1[full_path_len] = '\0';
>> 	}
>
> The critical benefit here is that we do not need to allocate a
> buffer if the traverse_path does not exist. That might be a
> worthwhile investment. That leads to justifying the use of
> bare 'char *'s instead of 'struct strbuf'.
>
> If the traverse_path is usually non-null, then we could continue using
> strbufs as a helper and get the planned performance gains by using
> strbuf_grow(&full_path, full_path_len + 1) followed by strbuf_add()
> (instead of strbuf_addstr()). That would make this code a bit less
> ugly with the only real overhead being the extra insertions of '\0'
> characters as we add the strings to the strbuf().

You create full_path only to compare it to another string.  You can
compare the pieces directly, without allocating and copying:

	const char *path;

	if (!skip_prefix(ce->name, info->traverse_path, &path) ||
	    strncmp(path, p->path, p->pathlen) ||
	    strcmp(path + p->pathlen, "/"))
		return NULL;

A test would be nice to demonstrate the fixed issue.

René