Victoria Dye wrote: > Shaoxuan Yuan wrote: >> Using check_dir_in_index without checking if the directory is on-disk >> could get a false positive for partially sparsified directory. >> >> Add a note in the documentation to warn potential user. >> >> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@xxxxxxxxx> >> --- >> builtin/mv.c | 5 +++++ >> 1 file changed, 5 insertions(+) >> >> diff --git a/builtin/mv.c b/builtin/mv.c >> index 4729bb1a1a..c8b9069db8 100644 >> --- a/builtin/mv.c >> +++ b/builtin/mv.c >> @@ -132,6 +132,11 @@ static int index_range_of_same_dir(const char *src, int length, >> * Return 0 if such directory exist (i.e. with any of its contained files not >> * marked with CE_SKIP_WORKTREE, the directory would be present in working tree). >> * Return 1 otherwise. >> + * >> + * Note: *always* check the directory is not on-disk before this function >> + * (i.e. using lstat()); >> + * otherwise it may return a false positive for a partially sparsified >> + * directory. > > To me, a comment like this indicates that either the function is not doing > what it should be doing, or its name doesn't properly describe the > function's behavior. > > Per the function description: > > Check if an out-of-cone directory should be in the index. Imagine > this case that all the files under a directory are marked with > 'CE_SKIP_WORKTREE' bit and thus the directory is sparsified. > > But neither the name of the function ('check_dir_in_index') nor the > function's behavior (hence the comment you're adding here) match that > description. > > Since this function is intended to verify that a directory 1) exists in the > index, and 2) is *entirely* sparse, I have two suggestions: > > * Change the description to specify that the non-existence of the directory > on disk is an *assumption*, not an opportunity for a "false positive." > It's a subtle distinction, but it clarifies that the function should be > used only when the caller already knows the directory is empty. For > example: > > /* > * Given the path of a directory that does not exist on-disk, check whether the > * directory contains any entries in the index with the SKIP_WORKTREE flag > * enabled. > * > * Return 1 if such index entries exist. > * Return 0 otherwise. > */ Whoops, I mixed up the return values (I assumed this returned a boolean based on the 'check' in the function name). In that case... > > * 'check_dir_in_index' doesn't reflect the "is not on disk" prerequisite, so > the function name can be updated to clarify that (e.g., > 'empty_dir_has_sparse_contents') ...there are two options. Either you can use a more "boolean-y" name (like the one I suggest above) and flip the return values (where "1" means "the empty dir has sparse contents"), or change the name to something that explicitly *doesn't* imply boolean. I'm personally in favor of the former (I'm really struggling to come up with a descriptive, non-boolean name for this function), but I'm fine with either if you can come up with a good function name. > >> */ >> static int check_dir_in_index(const char *name) >> { >