On Sun, Dec 15, 2019 at 2:29 AM Johannes Schindelin <Johannes.Schindelin@xxxxxx> wrote: > > Hi Elijah, > > I have not had time to dive deeply into this, but I know that it _does_ > cause a ton of segmentation faults in the `shears/pu` branch (where all of > Git for Windows' patches are rebased on top of `pu`): Weird. If it's going to cause segmentation faults at all, it would certainly do it all over the place, but I tested the patches on the major platforms using your Azure Pipelines setup on git.git so it should be good on all the platforms. Did your shears/pu branch make some other changes to the setup? > On Tue, 10 Dec 2019, Elijah Newren via GitGitGadget wrote: > > > diff --git a/dir.c b/dir.c > > index 645b44ea64..9c71a9ac21 100644 > > --- a/dir.c > > +++ b/dir.c > > @@ -2102,37 +2102,69 @@ static int treat_leading_path(struct dir_struct *dir, > > const struct pathspec *pathspec) > > { > > struct strbuf sb = STRBUF_INIT; > > - int baselen, rc = 0; > > + int prevlen, baselen; > > const char *cp; > > + struct cached_dir cdir; > > + struct dirent de; > > + enum path_treatment state = path_none; > > + > > + /* > > + * For each directory component of path, we are going to check whether > > + * that path is relevant given the pathspec. For example, if path is > > + * foo/bar/baz/ > > + * then we will ask treat_path() whether we should go into foo, then > > + * whether we should go into bar, then whether baz is relevant. > > + * Checking each is important because e.g. if path is > > + * .git/info/ > > + * then we need to check .git to know we shouldn't traverse it. > > + * If the return from treat_path() is: > > + * * path_none, for any path, we return false. > > + * * path_recurse, for all path components, we return true > > + * * <anything else> for some intermediate component, we make sure > > + * to add that path to the relevant list but return false > > + * signifying that we shouldn't recurse into it. > > + */ > > > > while (len && path[len - 1] == '/') > > len--; > > if (!len) > > return 1; > > + > > + memset(&cdir, 0, sizeof(cdir)); > > + memset(&de, 0, sizeof(de)); > > + cdir.de = &de; > > + de.d_type = DT_DIR; > > So here, `de` is zeroed out, and therefore `de.d_name` is `NULL`. Um, yeah...didn't I have an allocation of de.d_name here? It will always have a subset of path copied into it, so an allocation of len+1 is plenty long enough. > > baselen = 0; > > + prevlen = 0; > > while (1) { > > - cp = path + baselen + !!baselen; > > + prevlen = baselen + !!baselen; > > + cp = path + prevlen; > > cp = memchr(cp, '/', path + len - cp); > > if (!cp) > > baselen = len; > > else > > baselen = cp - path; > > - strbuf_setlen(&sb, 0); > > + strbuf_reset(&sb); > > strbuf_add(&sb, path, baselen); > > if (!is_directory(sb.buf)) > > break; > > - if (simplify_away(sb.buf, sb.len, pathspec)) > > - break; > > - if (treat_one_path(dir, NULL, istate, &sb, baselen, pathspec, > > - DT_DIR, NULL) == path_none) > > + strbuf_reset(&sb); > > + strbuf_add(&sb, path, prevlen); > > + memcpy(de.d_name, path+prevlen, baselen-prevlen); > > But here we try to copy a path into that `de.d_name`, which is still > `NULL`? > > That can't be right, can it? Yes, it can't be right. How did this possibly pass on any platform let alone all of them? (https://dev.azure.com/git/git/_build/results?buildId=1462&view=results). This is absolutely an important codepath that is hit; otherwise it couldn't fix the three tests from failure to success. Further, the subsequent patch added code within this if-block after this memcpy and fixed a few tests from failures to success. So it had to hit this code path as well. How could it not have segfaulted? I'm very confused...