On Wed, Feb 16, 2022 at 1:37 AM Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> wrote: > > On Fri, Jan 14 2022, Elijah Newren via GitGitGadget wrote: > > > From: Elijah Newren <newren@xxxxxxxxx> > > [...] > > +static int path_found(const char *path, const char **dirname, size_t *dir_len, > > + int *dir_found) > > +{ > > + struct stat st; > > + char *newdir; > > + char *tmp; > > + > > + /* > > + * If dirname corresponds to a directory that doesn't exist, and this > > + * path starts with dirname, then path can't exist. > > + */ > > + if (!*dir_found && !memcmp(path, *dirname, *dir_len)) > > + return 0; > > + > > + /* > > + * If path itself exists, return 1. > > + */ > > + if (!lstat(path, &st)) > > + return 1; > > + > > + /* > > + * Otherwise, path does not exist so we'll return 0...but we'll first > > + * determine some info about its parent directory so we can avoid > > + * lstat calls for future cache entries. > > + */ > > + newdir = strrchr(path, '/'); > > + if (!newdir) > > + return 0; /* Didn't find a parent dir; just return 0 now. */ > > + > > + /* > > + * If path starts with directory (which we already lstat'ed and found), > > + * then no need to lstat parent directory again. > > + */ > > + if (*dir_found && *dirname && memcmp(path, *dirname, *dir_len)) > > + return 0; > > I really don't care/just asking, but there was a discussion on another > topic about guarding calls to the mem*() family when n=0: > https://lore.kernel.org/git/xmqq1r24gsph.fsf@gitster.g/ > > Is this the same sort of redundancy where we could lose the "&& > *dirname" part, or is it still important because a "\0" dirname would > have corresponding non-0 *dir_len? No, dirname is a char**, not a char*. I need to make sure *dirname is non-NULL before passing to memcmp or we get segfaults (and *dirname will be NULL the first time it gets to this line, so the check is critical). > More generally ... (see below)... > > > + > > + /* Free previous dirname, and cache path's dirname */ > > + *dirname = path; > > + *dir_len = newdir - path + 1; > > + > > + tmp = xstrndup(path, *dir_len); > > + *dir_found = !lstat(tmp, &st); > > In most other places we're a bit more careful about lstat() error handling, e.g.: > > builtin/init-db.c: if (lstat(path->buf, &st_git)) { > builtin/init-db.c- if (errno != ENOENT) > builtin/init-db.c- die_errno(_("cannot stat '%s'"), path->buf); > builtin/init-db.c- } > > Shouldn't we do the same here and at least error() on return values of > -1 with an accompanying errno that isn't ENOENT? If we should do that everywhere, should we have an xlstat in wrapper.[ch]? > > + free(tmp); > > + > > + return 0; > > +} > > + > > void clear_skip_worktree_from_present_files(struct index_state *istate) > > { > > + const char *last_dirname = NULL; > > + size_t dir_len = 0; > > + int dir_found = 1; > > + > > int i; > > + > > if (!core_apply_sparse_checkout) > > return; > > > > restart: > > for (i = 0; i < istate->cache_nr; i++) { > > struct cache_entry *ce = istate->cache[i]; > > - struct stat st; > > > > - if (ce_skip_worktree(ce) && !lstat(ce->name, &st)) { > > + if (ce_skip_worktree(ce) && > > + path_found(ce->name, &last_dirname, &dir_len, &dir_found)) { > > ...(continued from above) is the "path is zero" part of this even > reachable? I tried with this on top and ran your tests (and the rest of > t*sparse*.sh) successfully: > > diff --git a/sparse-index.c b/sparse-index.c > index eed170cd8f7..f89c944d8cd 100644 > --- a/sparse-index.c > +++ b/sparse-index.c > @@ -403,6 +403,7 @@ void clear_skip_worktree_from_present_files(struct index_state *istate) > for (i = 0; i < istate->cache_nr; i++) { > struct cache_entry *ce = istate->cache[i]; > > + assert(*ce->name); > if (ce_skip_worktree(ce) && > path_found(ce->name, &last_dirname, &dir_len, &dir_found)) { > if (S_ISSPARSEDIR(ce->ce_mode)) { > > I.e. isn't this undue paranoia about the cache API giving us zero-length > paths? Nope, not related at all, for two reasons: the code above was checking for NULL pointers rather than NUL characters, and the argument I was checking was last_dirname, not ce->name.