Elijah Newren wrote: > Hi Victoria, > > On Fri, Feb 25, 2022 at 12:25 PM Victoria Dye <vdye@xxxxxxxxxx> wrote: >> >> Elijah Newren wrote: >>> On Thu, Feb 24, 2022 at 2:34 PM Victoria Dye via GitGitGadget >>> <gitgitgadget@xxxxxxxxx> wrote: >>>> >>>> From: Victoria Dye <vdye@xxxxxxxxxx> >>>> >>>> When 'git read-tree' is provided with a prefix, expand the index only if the >>>> prefix is equivalent to a sparse directory or contained within one. If the >>>> index is not expanded in these cases, 'ce_in_traverse_path' will indicate >>>> that the relevant sparse directory is not in the prefix/traverse path, >>>> skipping past it and not unpacking the appropriate tree(s). >>>> >>>> If the prefix is in-cone, its sparse subdirectories (if any) will be >>>> traversed correctly without index expansion. >>>> >>>> Signed-off-by: Victoria Dye <vdye@xxxxxxxxxx> >>>> --- >>>> builtin/read-tree.c | 3 +-- >>>> t/t1092-sparse-checkout-compatibility.sh | 8 ++++++- >>>> unpack-trees.c | 30 ++++++++++++++++++++++++ >>>> 3 files changed, 38 insertions(+), 3 deletions(-) >>>> >>>> diff --git a/builtin/read-tree.c b/builtin/read-tree.c >>>> index c2fdbc2657f..a7b7f822281 100644 >>>> --- a/builtin/read-tree.c >>>> +++ b/builtin/read-tree.c >>>> @@ -213,8 +213,7 @@ int cmd_read_tree(int argc, const char **argv, const char *cmd_prefix) >>>> if (opts.merge && !opts.index_only) >>>> setup_work_tree(); >>>> >>>> - /* TODO: audit sparse index behavior in unpack_trees */ >>>> - if (opts.skip_sparse_checkout || opts.prefix) >>>> + if (opts.skip_sparse_checkout) >>>> ensure_full_index(&the_index); >>>> >>>> if (opts.merge) { >>>> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh >>>> index ae44451a0a9..a404be0a10f 100755 >>>> --- a/t/t1092-sparse-checkout-compatibility.sh >>>> +++ b/t/t1092-sparse-checkout-compatibility.sh >>>> @@ -1415,7 +1415,13 @@ test_expect_success 'sparse index is not expanded: read-tree' ' >>>> do >>>> ensure_not_expanded read-tree -mu $MERGE_TREES && >>>> ensure_not_expanded reset --hard HEAD || return 1 >>>> - done >>>> + done && >>>> + >>>> + rm -rf sparse-index/deep/deeper2 && >>>> + ensure_not_expanded add . && >>>> + ensure_not_expanded commit -m "test" && >>>> + >>>> + ensure_not_expanded read-tree --prefix=deep/deeper2 -u deepest >>>> ' >>>> >>>> test_expect_success 'ls-files' ' >>>> diff --git a/unpack-trees.c b/unpack-trees.c >>>> index 360844bda3a..dba122a02bb 100644 >>>> --- a/unpack-trees.c >>>> +++ b/unpack-trees.c >>>> @@ -1739,6 +1739,36 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options >>>> setup_standard_excludes(o->dir); >>>> } >>>> >>>> + /* >>>> + * If the prefix is equal to or contained within a sparse directory, the >>>> + * index needs to be expanded to traverse with the specified prefix. >>>> + */ >>>> + if (o->prefix && o->src_index->sparse_index) { >>>> + int prefix_len = strlen(o->prefix); >>>> + >>>> + while (prefix_len > 0 && o->prefix[prefix_len - 1] == '/') >>>> + prefix_len--; >>>> + >>>> + if (prefix_len > 0) { >>> >>> Is this condition check necessary? If we want some safety check here, >>> could it instead be something like >>> >>> if (prefix_len <= 0) >>> BUG("Broken prefix passed to unpack_trees"); >>> >> >> This condition was intended to skip unnecessary computation for the >> (probably unlikely, but still technically valid) case where the prefix is >> the repo root (e.g., '--prefix=/') - because the repo root is represented >> with only directory separator(s), `prefix_len` would be 0 after removing >> trailing '/'. In that scenario, the index won't need to be expanded, so we >> don't need to go looking in the index for that path. > > Wow, that doesn't result in an error?!? That surprises me. I never > even thought to test such a thing. Clearly, the following related > command does give an error: > > git read-tree --prefix=/toplevel/ $TREE > > (namely, "error: invalid path '/toplevel/subpath'") > > whereas, the reason a single slash is accepted it because git is > trying to be forgiving and treat the following two commands the same: > > git read-tree --prefix=subdir/ $TREE > git read-tree --prefix=subdir $TREE > > i.e. it tries to allow the trailing slash to be optional. And, by > implementation quirk, making a trailing slash be optional turns out to > mean that --prefix=/ is treated the same as no prefix at all, because > empty string prefix just happens to give the same behavior as NULL > prefix. > If by "NULL prefix" you mean "the --prefix option isn't specified", the behavior isn't *quite* the same as with an empty prefix, since `git read-tree $TREE` will completely overwrite the index without regard for what's already there, whereas `git read-tree --prefix= $TREE` "merges" `$TREE` into the current index (failing if any entries in `$TREE` already exist in the index). I still think it's an extremely niche usage, but can also imagine a couple instances where it might be useful (e.g., reading from an orphan branch). > I think we should just throw an error if prefix starts with '/'. > unpack_trees() can make it be a BUG() (and at the beginning of the > function rather than down at this point and only inside some > conditional). builtin/read-tree.c, the only thing that ever sets > prefix in unpack_trees_options, should die() if it's passed something > that starts with a '/'. Having paths start with a '/' is antithetical > to how "prefix" is used throughout the codebase, though I guess I can > see people making that mistake if they are used to gitignore-style > patterns instead. > I think this makes sense (especially because it doesn't cause a loss of functionality - a user can still specify '--prefix=' for the repo root if they really want to). I'll include a patch for it in my next re-roll. >> None of that is particularly clear from reading the patch, though, so I'll >> add a comment & test covering it explicitly. >> >>> and then dedent the following code? (Or are callers allowed to not >>> sanitize their input before passing to unpack_trees(), meaning that we >>> should use a die() rather than a BUG()?) >>> >>> To test this idea, near the top of unpack_trees(), I added: >>> if (o->prefix) >>> assert(*o->prefix && *o->prefix != '/'); >>> and reran all tests. They all ran without hitting that assertion. FWIW. >>> >>>> + struct strbuf ce_prefix = STRBUF_INIT; >>>> + strbuf_grow(&ce_prefix, prefix_len + 1); >>>> + strbuf_add(&ce_prefix, o->prefix, prefix_len); >>>> + strbuf_addch(&ce_prefix, '/'); >>>> + >>>> + /* >>>> + * If the prefix is not inside the sparse cone, then the >>>> + * index is explicitly expanded if it is found as a sparse >>>> + * directory, or implicitly expanded (by 'index_name_pos') >>>> + * if the path is inside a sparse directory. >>>> + */ >>>> + if (!path_in_cone_mode_sparse_checkout(ce_prefix.buf, o->src_index) && >>>> + index_name_pos(o->src_index, ce_prefix.buf, ce_prefix.len) >= 0) >>> >>> style nit: Can you rewrap both the comments and the code at 80 characters? >>> >> >> I couldn't think of a way to wrap the condition that wouldn't make it more >> difficult to read. The best I could come up with was: >> >> if (!path_in_cone_mode_sparse_checkout(ce_prefix.buf, >> o->src_index) && >> index_name_pos(o->src_index, >> ce_prefix.buf, >> ce_prefix.len) >= 0) >> ensure_full_index(o->src_index); >> >> >> which, to me, is a bit hard to parse. Alternatively, though, I can move the >> prefix-checking logic into its own function (kind of like >> 'pathspec_needs_expanded_index(...)' in [1]), in which case I won't need to >> change the current wrapping to keep it under 80 characters. >> >> [1] https://lore.kernel.org/git/822d7344587f698e73abba1ca726c3a905f7b403.1638201164.git.gitgitgadget@xxxxxxxxx/ >> >>> It took me a bit of playing and testing to understand these two lines. >>> The comment helps, but it's still a bit dense to unpack; somehow I >>> didn't understand that the comment was referring to index_name_pos()'s >>> call to ensure_full_index(). Once I understood that, it all looks >>> good. >>> >> >> Sorry about that, I'll revise to make that clearer. > > Thanks. :-) > >>> >>>> + ensure_full_index(o->src_index); >>>> + >>>> + strbuf_release(&ce_prefix); >>>> + } >>>> + } >>>> + >>>> if (!core_apply_sparse_checkout || !o->update) >>>> o->skip_sparse_checkout = 1; >>>> if (!o->skip_sparse_checkout && !o->pl) { >>>> -- >>>> gitgitgadget