Shuqi Liang wrote: > Before this patch, git check-attr was unable to read the attributes from > a .gitattributes file within a sparse directory. The original comment > was operating under the assumption that users are only interested in > files or directories inside the cones. Therefore, in the original code, > in the case of a cone-mode sparse-checkout, we didn't load the > .gitattributes file. > > However, this behavior can lead to missing attributes for files inside > sparse directories, causing inconsistencies in file handling. > > To resolve this, revise 'git check-attr' to allow attribute reading for > files in sparse directories from the corresponding .gitattributes files: > > 1.Utilize path_in_cone_mode_sparse_checkout() and index_name_pos_sparse > to check if a path falls within a sparse directory. > > 2.If path is inside a sparse directory, employ the value of > index_name_pos_sparse() to find the sparse directory containing path and > path relative to sparse directory. Proceed to read attributes from the > tree OID of the sparse directory using read_attr_from_blob(). > > 3.If path is not inside a sparse directory,ensure that attributes are > fetched from the index blob with read_blob_data_from_index(). > > Modify previous tests so such difference is not considered as an error. I don't quite follow what you mean here "such difference". I see that you changed the 'test_expect_failure' to 'test_expect_success', but that's because the attributes inside a sparse directory are now being read rather than ignored. The rest of the commit message looks good to me, though. > > Helped-by: Victoria Dye <vdye@xxxxxxxxxx> > Signed-off-by: Shuqi Liang <cheskaqiqi@xxxxxxxxx> > --- > attr.c | 60 +++++++++++++++++------- > t/t1092-sparse-checkout-compatibility.sh | 4 +- > 2 files changed, 44 insertions(+), 20 deletions(-) > > diff --git a/attr.c b/attr.c > index 7d39ac4a29..7650f5481a 100644 > --- a/attr.c > +++ b/attr.c > @@ -808,35 +808,59 @@ static struct attr_stack *read_attr_from_blob(struct index_state *istate, > static struct attr_stack *read_attr_from_index(struct index_state *istate, > const char *path, unsigned flags) > { > + struct attr_stack *stack = NULL; > char *buf; > unsigned long size; > + int pos = -1; > + char normalize_path[PATH_MAX]; > + const char *relative_path; > > if (!istate) > return NULL; > > /* > - * The .gitattributes file only applies to files within its > - * parent directory. In the case of cone-mode sparse-checkout, > - * the .gitattributes file is sparse if and only if all paths > - * within that directory are also sparse. Thus, don't load the > - * .gitattributes file since it will not matter. > - * > - * In the case of a sparse index, it is critical that we don't go > - * looking for a .gitattributes file, as doing so would cause the > - * index to expand. > + * When handling sparse-checkouts, .gitattributes files > + * may reside within a sparse directory. We distinguish > + * whether a path exists directly in the index or not by > + * evaluating if 'pos' is negative. > + * If 'pos' is negative, the path is not directly present > + * in the index and is likely within a sparse directory. > + * For paths not in the index, The absolute value of 'pos' > + * minus 1 gives us the position where the path would be > + * inserted in lexicographic order within the index. > + * We then subtract another 1 from this value > + * (pos = -pos - 2) to find the position of the last > + * index entry which is lexicographically smaller than > + * the path. This would be the sparse directory containing > + * the path. By identifying the sparse directory containing > + * the path, we can correctly read the attributes specified > + * in the .gitattributes file from the tree object of the > + * sparse directory. > */ > - if (!path_in_cone_mode_sparse_checkout(path, istate)) > - return NULL; > + if (!path_in_cone_mode_sparse_checkout(path, istate)) { > + pos = index_name_pos_sparse(istate, path, strlen(path)); > > - buf = read_blob_data_from_index(istate, path, &size); > - if (!buf) > - return NULL; > - if (size >= ATTR_MAX_FILE_SIZE) { > - warning(_("ignoring overly large gitattributes blob '%s'"), path); > - return NULL; > + if (pos < 0) > + pos = -pos - 2; > } What 'pos' represents after this block is somewhat confusing/possibly misleading. Consider the possible cases for 'path': 1. 'path' is inside the sparse-checkout cone. 2. 'path' is not inside the sparse-checkout cone, but it is in the index (i.e., the sparse directory has been expanded for some reason). 3. 'path' is inside a sparse directory. In case #1, 'pos' will be '-1'. We never enter the '!path_in_cone_mode_sparse_checkout()' if-statement, so the value is never updated. In case #2, 'pos' will be positive, since it exists in the index. In case #3, the return value of 'index_name_pos_sparse()' will be negative, then assigned the value of '-pos - 2' to (in many cases) create a positive value pointing to the entry before the insertion point of 'path'. Based on your explanation in the code comment above, I would assume that, if 'pos' was >= 0 coming out of this block, it represented the index position of the potential sparse directory containing 'path'. But that's not true in case #2. To make the purpose of these variables clearer, instead of changing 'pos' in-place, you could create a variable like 'sparse_dir_pos'. Like 'pos' is now, it'd be initialized to '-1', but instead of unconditionally assigning it the output of 'index_name_pos_sparse()', you'd only assign it if the output of that function is negative: if (!path_in_cone_mode_sparse_checkout(path, istate)) { int pos = index_name_pos_sparse(istate, path, strlen(path)); if (pos < 0) sparse_dir_pos = -pos - 2; } Then, you'd use 'sparse_dir_pos' in the condition below, and it'd always be -1 when 'path' is definitely not contained in a sparse directory. > > - return read_attr_from_buf(buf, path, flags); > + if (pos >= 0 && !path_in_cone_mode_sparse_checkout(path, istate) && nit: the check of '!path_in_cone_mode_sparse_checkout()' is redundant, since 'pos' will always be '-1' if 'path_in_cone_mode_sparse_checkout()' is true. > + S_ISSPARSEDIR(istate->cache[pos]->ce_mode) && > + !strncmp(istate->cache[pos]->name, path, ce_namelen(istate->cache[pos])) && > + !normalize_path_copy(normalize_path, path)) { The normalization should come before the 'strncmp' so that you're comparing 'normalize_path' to the index entry. That said, I know I asked about path normalization earlier [1], but did you confirm that 'path' _isn't_ normalized? Because if it's normalized by all of the potential callers of this function, the check here would be unnecessary. [1] https://lore.kernel.org/git/e4a77d0f-cf1d-ef76-fe26-ad5e58372a02@xxxxxxxxxx/ > + relative_path = normalize_path + ce_namelen(istate->cache[pos]); 'relative_path' should be declared here, since it's not used outside this block. > + stack = read_attr_from_blob(istate, &istate->cache[pos]->oid, relative_path, flags); > + } else { > + buf = read_blob_data_from_index(istate, path, &size); > + if (!buf) > + return NULL; > + if (size >= ATTR_MAX_FILE_SIZE) { > + warning(_("ignoring overly large gitattributes blob '%s'"), path); > + return NULL; > + } > + stack = read_attr_from_buf(buf, path, flags); > + } > + return stack; > } > > static struct attr_stack *read_attr(struct index_state *istate, > diff --git a/t/t1092-sparse-checkout-compatibility.sh b/t/t1092-sparse-checkout-compatibility.sh > index 90633f383a..3f32c1f972 100755 > --- a/t/t1092-sparse-checkout-compatibility.sh > +++ b/t/t1092-sparse-checkout-compatibility.sh > @@ -2271,7 +2271,7 @@ test_expect_success 'check-attr with pathspec inside sparse definition' ' > test_all_match git check-attr -a --cached -- deep/a > ' > > -test_expect_failure 'check-attr with pathspec outside sparse definition' ' > +test_expect_success 'check-attr with pathspec outside sparse definition' ' Re: my suggested change to the test in patch 1 [2], when I applied _this_ patch, the test still failed for the 'sparse-index' case. It doesn't seem to be a problem with your patch, but rather a bug in 'diff' that can be reproduced with this test (using the infrastructure in t1092): test_expect_failure 'diff --cached shows differences in sparse directory' ' init_repos && test_all_match git reset --soft update-folder1 && test_all_match git diff --cached -- folder1/a ' It's not immediately obvious to me what the problem is, but my guess is it's some mis-handling of sparse directories in the internal diff machinery. Given the likely complexity of the issue, I'd be content with you leaving the 'diff --check' test as 'test_expect_failure' with a note about the bug in 'diff' to fix later. Or, if you do want to investigate & fix it now, I wouldn't be opposed to that either. :) [2] https://lore.kernel.org/git/c3ebe3b4-88b9-8ca2-2ee3-39a3e0d82201@xxxxxxxxxx/ > init_repos && > > echo "a -crlf myAttr" >>.gitattributes && > @@ -2288,7 +2288,7 @@ test_expect_failure 'check-attr with pathspec outside sparse definition' ' > test_all_match git check-attr -a --cached -- folder1/a > ' > > -test_expect_failure 'diff --check with pathspec outside sparse definition' ' > +test_expect_success 'diff --check with pathspec outside sparse definition' ' > init_repos && > > write_script edit-contents <<-\EOF &&