Re: [PATCH v2 07/10] sparse-index: partially expand directories

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 5/20/2022 2:17 PM, Junio C Hamano wrote:> "Derrick Stolee via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:
> 
>> @@ -231,18 +236,41 @@ static int add_path_to_index(const struct object_id *oid,
>>  			     struct strbuf *base, const char *path,
>>  			     unsigned int mode, void *context)
>>  {
> 
> OK, this function is a callback of read_tree_at(), whose caller is
> walking the source (=current) index, and while it is copying
> existing non-directory entries to the destination (=ctx->write)
> index, it found a tree in the source in-core index that does not
> match the cone pattern (or we are expanding fully).  Our job is to
> expand the given tree as a "subdirectory" in the project into the
> destination index.
> 
> We used to just let the calling read_tree_at() fully and recursively
> walk the tree while adding any non-tree entries to the destination.
> Now, we have the pattern list in the context, we go more selective.
> 
> The design makes sense.
> 
>> +	struct modify_index_context *ctx = (struct modify_index_context *)context;
>>  	struct cache_entry *ce;
>>  	size_t len = base->len;
>>  
>> +	if (S_ISDIR(mode)) {
>> +		int dtype;
>> +		size_t baselen = base->len;
>> +		if (!ctx->pl)
>> +			return READ_TREE_RECURSIVE;
> 
> Fully recursive case.  We can do without any match, just like the
> caller (i.e. expand_to_pattern_list) did, when the pattern list is
> NULL.
> 
>> +		/*
>> +		 * Have we expanded to a point outside of the sparse-checkout?
>> +		 */
>> +		strbuf_addstr(base, path);
>> +		strbuf_add(base, "/-", 2);
>> +
>> +		if (path_matches_pattern_list(base->buf, base->len,
>> +					      NULL, &dtype,
>> +					      ctx->pl, ctx->write)) {
> 
> If that sample path in the directory matches (MATCHED or
> MATCHED_RECURSIVE) the patterns, we recurse into the tree to expand.
> 
> Can the function return UNDECIDED here?  If so what should happen?
> As written, the code will behave as if it matched, and it is not
> quite clear if that is the behaviour intended by this patch or an
> accident.

UNDECIDED can only be returned when not in cone mode. We are in cone mode
when this helper is used.

>> +			strbuf_setlen(base, baselen);
>> +			return READ_TREE_RECURSIVE;
>> +		}
> 
> The caller found a tree at path <path> in the index.  We check if
> our patterns match a fictitious path <path> + "/-", which may exist
> if the <path> is a directory and if there is a funny named file or
> directory "-" in it.
> 
> Why "-"?  Are we trying ot see if "ctx->pl" matches "anything" in
> the directory that is at <path>?  Is the assumption here that pl
> only names directories literally without blobs (I presume that is
> the same thing as assuming the cone mode)?
> 
> I am trying to see if there is a way that expresses the intention of
> this code more clearly than using an arbitrary path "-" (and trying
> to figure out if I got the intention right in the first place ;-)..

The '+ "/-"' is there to give us an example file path for "<path>" as a
directory. This is specifically because path_matches_pattern_list() checks
the parent directories for matches, expecting the input to be a file name.

This could use a comment. I'm thinking...

/*
 * Have we expanded to a point outside of the sparse-checkout?
 *
 * Artificially pad the path name with a slash "/" to
 * indicate it as a directory, and add an arbitrary file
 * name ("-") so we can consider base->buf as a file name
 * to match against the cone-mode patterns.
 *
 * If we compared just "path", then we would expand more
 * than we should. Since every file at root is always
 * included, we would expand every directory at root at
 * least one level deep instead of using sparse directory
 * entries.
 */

>> +		/*
>> +		 * The path "{base}{path}/" is a sparse directory. Create the correct
>> +		 * name for inserting the entry into the index.
>> +		 */
>> +		strbuf_setlen(base, base->len - 1);
> 
> This removes that phoney "-" while keeping the trailing "/".  Just
> like "-" was unclear, understanding this "- 1" requires that the
> reader understands why "/-" was used earlier.
> 
> The resulting "base" is used in the newly created cache entry to
> represent the (unexpanded) directory below, and such a cache entry
> is supposed to have a path with a trailing slash, so it makes sense.
> 
>> +	} else {
>> +		strbuf_addstr(base, path);
> 
> For any non-tree thing, we use the given path for the cache entry to
> represent it.
> 
>> +	}
>> +
>> +	ce = make_cache_entry(ctx->write, mode, oid, base->buf, 0, 0);
>>  	ce->ce_flags |= CE_SKIP_WORKTREE | CE_EXTENDED;
>> -	set_index_entry(istate, istate->cache_nr++, ce);
>> +	set_index_entry(ctx->write, ctx->write->cache_nr++, ce);
> 
> And the cache entry (newly created) is added to the destination.
> Unlike what happens in the caller (i.e. expand_to_pattern_list) that
> moves the cache entry taken from the source index to the destination
> index, the caller will discard the cache entry taken from the source
> index after read_tree_at() returns, and we create a new instance for
> ourselves here, even if we _could_ have reused it (by passing it in
> the context structure, for example).
> 
>>  	strbuf_setlen(base, len);
> 
> And we restore the length of the path in the base to the original
> before we return.
> 
>>  	return 0;
> 
> Returning 0 as opposed to READ_TREE_RECURSIVE here means "we've
> dealt with the entry; don't recurse into subtree even if you called
> us with a tree", which is the right thing to do here, as we did have
> done all we need to here.
> 
> OK, except for that "/-" thing, all of the above makes sense to me.

Thanks for your careful read. Hopefully my explanation of the "/-" thing
helps.

Thanks,
-Stolee



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux