Re: [PATCH v3 4/8] unpack-trees: fix nested sparse-dir search

Derrick Stolee <stolee@xxxxxxxxx> · Fri, 20 Aug 2021 11:18:18 -0400

On 8/19/2021 4:01 AM, Johannes Schindelin wrote:
> Hi Stolee,
> 
> On Tue, 17 Aug 2021, Derrick Stolee via GitGitGadget wrote:
> 
>> From: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
>>
>> The iterated search in find_cache_entry() was recently modified to
>> include a loop that searches backwards for a sparse directory entry that
>> matches the given traverse_info and name_entry. However, the string
>> comparison failed to actually concatenate those two strings, so this
>> failed to find a sparse directory when it was not a top-level directory.
>>
>> This caused some errors in rare cases where a 'git checkout' spanned a
>> diff that modified files within the sparse directory entry, but we could
>> not correctly find the entry.
> 
> Good explanation.
> 
> I wonder a bit about the performance impact. How "hot" is this function?
> I.e. how often is it called, on average?
> 
> I ask because I see opportunities to optimize in both directions: it could
> be written more concisely (if speed does not matter as much), and it could
> be made faster (if speed matters a lot). See below for more.

I would definitely optimize for speed here. This can be a very hot path,
I believe.

>> +	strbuf_addstr(&full_path, info->traverse_path);
>> +	strbuf_add(&full_path, p->path, p->pathlen);
>> +	strbuf_addch(&full_path, '/');
> 
> This could be reduced to:
> 
> 	strbuf_addf(&full_path, "%s%.*s/",
> 		    info->traverse_path, (int)p->pathlen, p->path);

We should definitely avoid formatted strings here, if possible.

> But if speed matters, we probably need something more like this:
> 
> 	size_t full_path_len;
> 	const char *full_path;
> 	char *full_path_1 = NULL;
> 
> 	if (!*info->traverse_path) {
> 		full_path = p->path;
> 		full_path_len = p->pathlen;
> 	} else {
> 		size_t len = strlen(info->traverse_path);
> 
> 		full_path_len = len + p->pathlen + 1;
> 		full_path = full_path_1 = xmalloc(full_path_len + 1);
> 		memcpy(full_path_1, info->traverse_path, len);
> 		memcpy(full_path_1 + len, p->path, p->pathlen);
> 		full_path_1[full_path_len - 1] = '/';
> 		full_path_1[full_path_len] = '\0';
> 	}

The critical benefit here is that we do not need to allocate a
buffer if the traverse_path does not exist. That might be a
worthwhile investment. That leads to justifying the use of
bare 'char *'s instead of 'struct strbuf'.

If the traverse_path is usually non-null, then we could continue using
strbufs as a helper and get the planned performance gains by using
strbuf_grow(&full_path, full_path_len + 1) followed by strbuf_add()
(instead of strbuf_addstr()). That would make this code a bit less
ugly with the only real overhead being the extra insertions of '\0'
characters as we add the strings to the strbuf().

I will need to investigate so see which one is the best.

Thanks,
-Stolee