Hi Victoria, On Fri, Feb 25, 2022 at 12:25 PM Victoria Dye <vdye@xxxxxxxxxx> wrote: > > Elijah Newren wrote: > > On Thu, Feb 24, 2022 at 2:34 PM Victoria Dye via GitGitGadget > > <gitgitgadget@xxxxxxxxx> wrote: > >> > >> From: Victoria Dye <vdye@xxxxxxxxxx> > >> > >> When 'git read-tree' is provided with a prefix, expand the index only if the > >> prefix is equivalent to a sparse directory or contained within one. If the > >> index is not expanded in these cases, 'ce_in_traverse_path' will indicate > >> that the relevant sparse directory is not in the prefix/traverse path, > >> skipping past it and not unpacking the appropriate tree(s). > >> > >> If the prefix is in-cone, its sparse subdirectories (if any) will be > >> traversed correctly without index expansion. > >> > >> Signed-off-by: Victoria Dye <vdye@xxxxxxxxxx> > >> --- > >> builtin/read-tree.c | 3 +-- > >> t/t1092-sparse-checkout-compatibility.sh | 8 ++++++- > >> unpack-trees.c | 30 ++++++++++++++++++++++++ > >> 3 files changed, 38 insertions(+), 3 deletions(-) > >> > >> diff --git a/builtin/read-tree.c b/builtin/read-tree.c > >> index c2fdbc2657f..a7b7f822281 100644 > >> --- a/builtin/read-tree.c > >> +++ b/builtin/read-tree.c > >> @@ -213,8 +213,7 @@ int cmd_read_tree(int argc, const char **argv, const char *cmd_prefix) > >> if (opts.merge && !opts.index_only) > >> setup_work_tree(); > >> > >> - /* TODO: audit sparse index behavior in unpack_trees */ > >> - if (opts.skip_sparse_checkout || opts.prefix) > >> + if (opts.skip_sparse_checkout) > >> ensure_full_index(&the_index); > >> > >> if (opts.merge) { > >> diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh > >> index ae44451a0a9..a404be0a10f 100755 > >> --- a/t/t1092-sparse-checkout-compatibility.sh > >> +++ b/t/t1092-sparse-checkout-compatibility.sh > >> @@ -1415,7 +1415,13 @@ test_expect_success 'sparse index is not expanded: read-tree' ' > >> do > >> ensure_not_expanded read-tree -mu $MERGE_TREES && > >> ensure_not_expanded reset --hard HEAD || return 1 > >> - done > >> + done && > >> + > >> + rm -rf sparse-index/deep/deeper2 && > >> + ensure_not_expanded add . && > >> + ensure_not_expanded commit -m "test" && > >> + > >> + ensure_not_expanded read-tree --prefix=deep/deeper2 -u deepest > >> ' > >> > >> test_expect_success 'ls-files' ' > >> diff --git a/unpack-trees.c b/unpack-trees.c > >> index 360844bda3a..dba122a02bb 100644 > >> --- a/unpack-trees.c > >> +++ b/unpack-trees.c > >> @@ -1739,6 +1739,36 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options > >> setup_standard_excludes(o->dir); > >> } > >> > >> + /* > >> + * If the prefix is equal to or contained within a sparse directory, the > >> + * index needs to be expanded to traverse with the specified prefix. > >> + */ > >> + if (o->prefix && o->src_index->sparse_index) { > >> + int prefix_len = strlen(o->prefix); > >> + > >> + while (prefix_len > 0 && o->prefix[prefix_len - 1] == '/') > >> + prefix_len--; > >> + > >> + if (prefix_len > 0) { > > > > Is this condition check necessary? If we want some safety check here, > > could it instead be something like > > > > if (prefix_len <= 0) > > BUG("Broken prefix passed to unpack_trees"); > > > > This condition was intended to skip unnecessary computation for the > (probably unlikely, but still technically valid) case where the prefix is > the repo root (e.g., '--prefix=/') - because the repo root is represented > with only directory separator(s), `prefix_len` would be 0 after removing > trailing '/'. In that scenario, the index won't need to be expanded, so we > don't need to go looking in the index for that path. Wow, that doesn't result in an error?!? That surprises me. I never even thought to test such a thing. Clearly, the following related command does give an error: git read-tree --prefix=/toplevel/ $TREE (namely, "error: invalid path '/toplevel/subpath'") whereas, the reason a single slash is accepted it because git is trying to be forgiving and treat the following two commands the same: git read-tree --prefix=subdir/ $TREE git read-tree --prefix=subdir $TREE i.e. it tries to allow the trailing slash to be optional. And, by implementation quirk, making a trailing slash be optional turns out to mean that --prefix=/ is treated the same as no prefix at all, because empty string prefix just happens to give the same behavior as NULL prefix. I think we should just throw an error if prefix starts with '/'. unpack_trees() can make it be a BUG() (and at the beginning of the function rather than down at this point and only inside some conditional). builtin/read-tree.c, the only thing that ever sets prefix in unpack_trees_options, should die() if it's passed something that starts with a '/'. Having paths start with a '/' is antithetical to how "prefix" is used throughout the codebase, though I guess I can see people making that mistake if they are used to gitignore-style patterns instead. > None of that is particularly clear from reading the patch, though, so I'll > add a comment & test covering it explicitly. > > > and then dedent the following code? (Or are callers allowed to not > > sanitize their input before passing to unpack_trees(), meaning that we > > should use a die() rather than a BUG()?) > > > > To test this idea, near the top of unpack_trees(), I added: > > if (o->prefix) > > assert(*o->prefix && *o->prefix != '/'); > > and reran all tests. They all ran without hitting that assertion. FWIW. > > > >> + struct strbuf ce_prefix = STRBUF_INIT; > >> + strbuf_grow(&ce_prefix, prefix_len + 1); > >> + strbuf_add(&ce_prefix, o->prefix, prefix_len); > >> + strbuf_addch(&ce_prefix, '/'); > >> + > >> + /* > >> + * If the prefix is not inside the sparse cone, then the > >> + * index is explicitly expanded if it is found as a sparse > >> + * directory, or implicitly expanded (by 'index_name_pos') > >> + * if the path is inside a sparse directory. > >> + */ > >> + if (!path_in_cone_mode_sparse_checkout(ce_prefix.buf, o->src_index) && > >> + index_name_pos(o->src_index, ce_prefix.buf, ce_prefix.len) >= 0) > > > > style nit: Can you rewrap both the comments and the code at 80 characters? > > > > I couldn't think of a way to wrap the condition that wouldn't make it more > difficult to read. The best I could come up with was: > > if (!path_in_cone_mode_sparse_checkout(ce_prefix.buf, > o->src_index) && > index_name_pos(o->src_index, > ce_prefix.buf, > ce_prefix.len) >= 0) > ensure_full_index(o->src_index); > > > which, to me, is a bit hard to parse. Alternatively, though, I can move the > prefix-checking logic into its own function (kind of like > 'pathspec_needs_expanded_index(...)' in [1]), in which case I won't need to > change the current wrapping to keep it under 80 characters. > > [1] https://lore.kernel.org/git/822d7344587f698e73abba1ca726c3a905f7b403.1638201164.git.gitgitgadget@xxxxxxxxx/ > > > It took me a bit of playing and testing to understand these two lines. > > The comment helps, but it's still a bit dense to unpack; somehow I > > didn't understand that the comment was referring to index_name_pos()'s > > call to ensure_full_index(). Once I understood that, it all looks > > good. > > > > Sorry about that, I'll revise to make that clearer. Thanks. :-) > > > >> + ensure_full_index(o->src_index); > >> + > >> + strbuf_release(&ce_prefix); > >> + } > >> + } > >> + > >> if (!core_apply_sparse_checkout || !o->update) > >> o->skip_sparse_checkout = 1; > >> if (!o->skip_sparse_checkout && !o->pl) { > >> -- > >> gitgitgadget