"Elijah Newren via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes: > ... > Fix most these problems by making treat_leading_path() not only loop > over each leading path component, but calling treat_path() directly on > each. To do so, we have to create a synthetic dir_entry, but that only > takes a few lines. Then, pay attention to the path_treatment result we > get from treat_path() and don't treat path_excluded, path_untracked, and > path_recurse all the same as path_recurse. > > This leaves one remaining problem, the new inconsistency from commit > df5bcdf83ae. That will be addressed in a subsequent commit. > > Signed-off-by: Elijah Newren <newren@xxxxxxxxx> > --- > dir.c | 57 +++++++++++++++---- > ...common-prefixes-and-directory-traversal.sh | 6 +- > 2 files changed, 49 insertions(+), 14 deletions(-) > > diff --git a/dir.c b/dir.c > index 645b44ea64..1de5d7ad33 100644 > --- a/dir.c > +++ b/dir.c > @@ -2102,37 +2102,72 @@ static int treat_leading_path(struct dir_struct *dir, > const struct pathspec *pathspec) > { > struct strbuf sb = STRBUF_INIT; > - int baselen, rc = 0; > + int prevlen, baselen; > const char *cp; > + struct cached_dir cdir; > + struct dirent *de; > + enum path_treatment state = path_none; > + > + /* > + * For each directory component of path, we are going to check whether > + * that path is relevant given the pathspec. For example, if path is > + * foo/bar/baz/ > + * then we will ask treat_path() whether we should go into foo, then > + * whether we should go into bar, then whether baz is relevant. > + * Checking each is important because e.g. if path is > + * .git/info/ > + * then we need to check .git to know we shouldn't traverse it. > + * If the return from treat_path() is: > + * * path_none, for any path, we return false. > + * * path_recurse, for all path components, we return true > + * * <anything else> for some intermediate component, we make sure > + * to add that path to the relevant list but return false > + * signifying that we shouldn't recurse into it. > + */ > > while (len && path[len - 1] == '/') > len--; > if (!len) > return 1; > + > + de = xcalloc(1, sizeof(struct dirent)+len+1); That "+len+1" may deserve a comment? If we wanted to shoot for the minimum memory consumption (and we do not), we would probably allocate (sizeof(struct dirent) - sizeof(de->d_name)) + max(sizeof(de->d_name), len + 1) bytes, but unconditionally adding len+1 is simpler and easier to understand. Either way, we *are* relying on the assumption that either: (1) the "struct dirent" would have d_name[] array at the end of the struct, and by over-allocating, we can safely fit and carry a name that is much longer than sizeof(.d_name[]); OR (2) the "struct dirent" has d_name[] that is large enough to hold len+1 bytes, if the assumption (1) does not hold. is true. > + memset(&cdir, 0, sizeof(cdir)); > + cdir.de = de; > +#if defined(DT_UNKNOWN) && !defined(NO_D_TYPE_IN_DIRENT) > + de->d_type = DT_DIR; > +#endif > baselen = 0; > + prevlen = 0; > while (1) { > - cp = path + baselen + !!baselen; > + prevlen = baselen + !!baselen; > + cp = path + prevlen; > cp = memchr(cp, '/', path + len - cp); > if (!cp) > baselen = len; > else > baselen = cp - path; > - strbuf_setlen(&sb, 0); > + strbuf_reset(&sb); > strbuf_add(&sb, path, baselen); > if (!is_directory(sb.buf)) > break; > - if (simplify_away(sb.buf, sb.len, pathspec)) > - break; > - if (treat_one_path(dir, NULL, istate, &sb, baselen, pathspec, > - DT_DIR, NULL) == path_none) > + strbuf_reset(&sb); > + strbuf_add(&sb, path, prevlen); > + memcpy(de->d_name, path+prevlen, baselen-prevlen); > + de->d_name[baselen-prevlen] = '\0'; > + state = treat_path(dir, NULL, &cdir, istate, &sb, prevlen, > + pathspec); So this is the crux fo the fix---instead of doing a (poor) imitation of what treat_path() does by calling simplify_away() and treat_one_path() ourselves, we make a call to the real thing. Looking good. Thanks.