Re: [PATCH v6 0/8] Sparse Index: integrate with reset

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Nov 29 2021, Victoria Dye wrote:

> Elijah Newren wrote:
>> Hi,
>> 
>> On Mon, Nov 29, 2021 at 7:52 AM Victoria Dye via GitGitGadget
>> <gitgitgadget@xxxxxxxxx> wrote:
>>>
>>> Changes since V5
>>> ================
>>>
>>>  * Update t1092 test 'reset with wildcard pathspec' with more cases and
>>>    better checks
>>>  * Add "special case" wildcard pathspec check when determining whether to
>>>    expand the index (avoids double-loop over pathspecs & index entries)
>> 
>> Looks pretty good.  However, I'm worried this special case you added
>> at my prodding might be problematic, and that I may have been wrong to
>> prod you into it...
>> 
>>> Thanks! -Victoria
>>>
>>> Range-diff vs v5:
>>>
>>>  7:  a9135a5ed64 ! 7:  822d7344587 reset: make --mixed sparse-aware
>>>      @@ Commit message
>>>
>>>           Remove the `ensure_full_index` guard on `read_from_tree` and update `git
>>>           reset --mixed` to ensure it can use sparse directory index entries wherever
>>>      -    possible. Sparse directory entries are reset use `diff_tree_oid`, which
>>>      +    possible. Sparse directory entries are reset using `diff_tree_oid`, which
>>>           requires `change` and `add_remove` functions to process the internal
>>>           contents of the sparse directory. The `recursive` diff option handles cases
>>>           in which `reset --mixed` must diff/merge files that are nested multiple
>>>      @@ builtin/reset.c: static void update_index_from_diff(struct diff_queue_struct *q,
>>>       +          * (since we can reset whole sparse directories without expanding them).
>>>       +          */
>>>       +         if (item.nowildcard_len < item.len) {
>>>      ++                 /*
>>>      ++                  * Special case: if the pattern is a path inside the cone
>>>      ++                  * followed by only wildcards, the pattern cannot match
>>>      ++                  * partial sparse directories, so we don't expand the index.
>>>      ++                  */
>>>      ++                 if (path_in_cone_mode_sparse_checkout(item.original, &the_index) &&
>>>      ++                     strspn(item.original + item.nowildcard_len, "*") == item.len - item.nowildcard_len)
>> 
>> I usually expect in an &&-chain to see the cheaper function call first
>> (because that ordering often avoids the need to call the second
>> function), and I would presume that strspn() would be the cheaper of
>> the two.  Did you switch the order because you expect the strspn call
>> to nearly always return true, though?
>> 
>
> This is a miss on my part, the `strspn()` check is probably less expensive
> and should be first.

I doubt it matters either way, and I didn't look into this to any degree
of carefulness.

But having followed the breadcrumb trail from the "What's Cooking"
discussion & looked at the code one thing that stuck out for me was that
path_in_cone_mode_sparse_checkout() appears returns 1 inconditionally in
some cases based on global state:

	/*
	 * We default to accepting a path if there are no patterns or
	 * they are of the wrong type.
	 */
	if (init_sparse_checkout_patterns(istate) ||
	    (require_cone_mode &&
	     !istate->sparse_checkout_patterns->use_cone_patterns))
		return 1;

So moreso than the nano-optimization of strspn()
v.s. path_in_cone_mode_sparse_checkout() I found it a bit odd that we're
calling something in a loop where presumably we can punt out a lot
earlier, and at least make that "continue" a "break" or "return" in that
case.

I.e. something in this direction (this patch obviously doesn't even
compile, but should clarify what I'm blathering about :); but again, I
really haven't looked at this properly, so just food for thought:

diff --git a/builtin/reset.c b/builtin/reset.c
index b1ff699b43a..cefdabb09c2 100644
--- a/builtin/reset.c
+++ b/builtin/reset.c
@@ -187,6 +187,9 @@ static int pathspec_needs_expanded_index(const struct pathspec *pathspec)
 	if (pathspec->magic)
 		return 1;
 
+	if (cant_possibly_have_path_in_cone_mode_blah_blah(&the_index))
+		return 1;
+
 	for (i = 0; i < pathspec->nr; i++) {
 		struct pathspec_item item = pathspec->items[i];
 
diff --git a/dir.c b/dir.c
index 5aa6fbad0b7..19f2d989dd3 100644
--- a/dir.c
+++ b/dir.c
@@ -1456,14 +1456,8 @@ int init_sparse_checkout_patterns(struct index_state *istate)
 	return 0;
 }
 
-static int path_in_sparse_checkout_1(const char *path,
-				     struct index_state *istate,
-				     int require_cone_mode)
+int cant_possibly_have_path_in_cone_mode_blah_blah(...)
 {
-	int dtype = DT_REG;
-	enum pattern_match_result match = UNDECIDED;
-	const char *end, *slash;
-
 	/*
 	 * We default to accepting a path if there are no patterns or
 	 * they are of the wrong type.
@@ -1472,6 +1466,16 @@ static int path_in_sparse_checkout_1(const char *path,
 	    (require_cone_mode &&
 	     !istate->sparse_checkout_patterns->use_cone_patterns))
 		return 1;
+}
+
+
+static int path_in_sparse_checkout_1(const char *path,
+				     struct index_state *istate,
+				     int require_cone_mode)
+{
+	int dtype = DT_REG;
+	enum pattern_match_result match = UNDECIDED;
+	const char *end, *slash;
 
 	/*
 	 * If UNDECIDED, use the match from the parent dir (recursively), or



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux