Re: [PATCH v3 4/8] unpack-trees: fix nested sparse-dir search

Hi Stolee,

On Tue, 17 Aug 2021, Derrick Stolee via GitGitGadget wrote:

> From: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
>
> The iterated search in find_cache_entry() was recently modified to
> include a loop that searches backwards for a sparse directory entry that
> matches the given traverse_info and name_entry. However, the string
> comparison failed to actually concatenate those two strings, so this
> failed to find a sparse directory when it was not a top-level directory.
>
> This caused some errors in rare cases where a 'git checkout' spanned a
> diff that modified files within the sparse directory entry, but we could
> not correctly find the entry.

Good explanation.
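
To make the failure mode concrete, here is a minimal stand-alone sketch of
the two comparisons on plain C strings. The helper names are made up for
illustration and are not part of unpack-trees.c; the sketch also assumes
that a non-empty traverse_path already ends in '/', as the concatenation
in the patch implies.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helpers, not Git's API: they only mimic the old and new
 * strncmp() checks from the patch on NUL-terminated strings.
 */

/* Old check: compare the index entry name against the leaf name only. */
static int old_prefix_matches(const char *ce_name, const char *leaf)
{
	assert(ce_name && leaf);
	return !strncmp(ce_name, leaf, strlen(leaf));
}

/* New check: compare against traverse_path + leaf + "/". */
static int new_prefix_matches(const char *ce_name,
			      const char *traverse_path, const char *leaf)
{
	size_t tlen, llen;

	assert(ce_name && traverse_path && leaf);
	tlen = strlen(traverse_path);
	llen = strlen(leaf);

	/* Each successful step guarantees the next offset is in bounds. */
	return !strncmp(ce_name, traverse_path, tlen) &&
	       !strncmp(ce_name + tlen, leaf, llen) &&
	       ce_name[tlen + llen] == '/';
}
```

With an entry named "folder1/sub/", the old check against the leaf "sub"
fails on the very first character, while the new check against
"folder1/" plus "sub" succeeds. For a top-level "sub/" both agree, which
matches your description that only nested sparse directories were missed.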

I wonder a bit about the performance impact. How "hot" is this function?
I.e. how often is it called, on average?

I ask because I see opportunities to optimize in both directions: it could
be written more concisely (if speed does not matter as much), and it could
be made faster (if speed matters a lot). See below for more.

>
> Signed-off-by: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
> ---
>  unpack-trees.c | 18 +++++++++++++-----
>  1 file changed, 13 insertions(+), 5 deletions(-)
>
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 5786645f315..df1f4437723 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -1255,9 +1255,10 @@ static int sparse_dir_matches_path(const struct cache_entry *ce,
>  static struct cache_entry *find_cache_entry(struct traverse_info *info,
>  					    const struct name_entry *p)
>  {
> -	struct cache_entry *ce;
> +	struct cache_entry *ce = NULL;

Makes sense: since you need to release the allocated memory, you can no
longer `return NULL` early, but have to break out of the loop and return
`ce`.

>  	int pos = find_cache_pos(info, p->path, p->pathlen);
>  	struct unpack_trees_options *o = info->data;
> +	struct strbuf full_path = STRBUF_INIT;
>
>  	if (0 <= pos)
>  		return o->src_index->cache[pos];
> @@ -1273,6 +1274,10 @@ static struct cache_entry *find_cache_entry(struct traverse_info *info,
>  	if (pos < 0 || pos >= o->src_index->cache_nr)
>  		return NULL;
>
> +	strbuf_addstr(&full_path, info->traverse_path);
> +	strbuf_add(&full_path, p->path, p->pathlen);
> +	strbuf_addch(&full_path, '/');

This could be reduced to:

	strbuf_addf(&full_path, "%s%.*s/",
		    info->traverse_path, (int)p->pathlen, p->path);

But if speed matters, we probably need something more like this:

	size_t full_path_len;
	const char *full_path;
	char *full_path_1 = NULL;

	if (!*info->traverse_path) {
		full_path = p->path;
		full_path_len = p->pathlen;
	} else {
		size_t len = strlen(info->traverse_path);

		full_path_len = len + p->pathlen + 1;
		full_path = full_path_1 = xmalloc(full_path_len + 1);
		memcpy(full_path_1, info->traverse_path, len);
		memcpy(full_path_1 + len, p->path, p->pathlen);
		full_path_1[full_path_len - 1] = '/';
		full_path_1[full_path_len] = '\0';
	}

	[...]

	free(full_path_1);

It would obviously be much nicer if we did not have to go for that ugly
long version...
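
If we did go that way, the copying could at least be hidden in a small
helper. The following is only a sketch that compiles outside of Git: the
function name is hypothetical, and it uses plain malloc() where Git code
would use xmalloc(). Unlike the snippet above, it always allocates and
always appends the slash, so it gives up the top-level shortcut.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Git): build "<traverse_path><name>/"
 * in a freshly allocated, NUL-terminated buffer. `name` need not be
 * NUL-terminated; only `name_len` bytes are copied. Returns NULL if the
 * allocation fails. The caller must free() the result.
 */
static char *join_sparse_dir_path(const char *traverse_path,
				  const char *name, size_t name_len,
				  size_t *out_len)
{
	size_t len, total;
	char *buf;

	assert(traverse_path && name && out_len);
	len = strlen(traverse_path);
	total = len + name_len + 1;	/* +1 for the trailing '/' */
	buf = malloc(total + 1);	/* +1 for the terminating NUL */
	if (!buf)
		return NULL;

	memcpy(buf, traverse_path, len);
	memcpy(buf + len, name, name_len);
	buf[total - 1] = '/';
	buf[total] = '\0';
	*out_len = total;
	return buf;
}
```

The caller would then pass the returned buffer and length to strncmp()
inside the loop and free() the buffer once after the loop, much like the
strbuf_release() call in the patch.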

> +
>  	/*
>  	 * Due to lexicographic sorting and sparse directory
>  	 * entries ending with a trailing slash, our path as a
> @@ -1283,17 +1288,20 @@ static struct cache_entry *find_cache_entry(struct traverse_info *info,
>  	while (pos >= 0) {
>  		ce = o->src_index->cache[pos];
>
> -		if (strncmp(ce->name, p->path, p->pathlen))
> -			return NULL;
> +		if (strncmp(ce->name, full_path.buf, full_path.len)) {
> +			ce = NULL;
> +			break;
> +		}
>
>  		if (S_ISSPARSEDIR(ce->ce_mode) &&
>  		    sparse_dir_matches_path(ce, info, p))
> -			return ce;
> +			break;
>
>  		pos--;
>  	}
>
> -	return NULL;
> +	strbuf_release(&full_path);
> +	return ce;
>  }
>
>  static void debug_path(struct traverse_info *info)
> --
> gitgitgadget
>
>
