Hi Elijah, On Mon, 16 Dec 2019, Elijah Newren wrote: > On Mon, Dec 16, 2019 at 5:51 AM Elijah Newren <newren@xxxxxxxxx> wrote: > > > > On Sun, Dec 15, 2019 at 2:29 AM Johannes Schindelin > > <Johannes.Schindelin@xxxxxx> wrote: > > > > > > Hi Elijah, > > > > > > I have not had time to dive deeply into this, but I know that it _does_ > > > cause a ton of segmentation faults in the `shears/pu` branch (where all of > > > Git for Windows' patches are rebased on top of `pu`): > > > > Weird. If it's going to cause segmentation faults at all, it would > > certainly do it all over the place, but I tested the patches on the > > major platforms using your Azure Pipelines setup on git.git so it > > should be good on all the platforms. Did your shears/pu branch make > > some other changes to the setup? Not really. > > > On Tue, 10 Dec 2019, Elijah Newren via GitGitGadget wrote: > > > > > > > diff --git a/dir.c b/dir.c > > > > index 645b44ea64..9c71a9ac21 100644 > > > > --- a/dir.c > > > > +++ b/dir.c > > > > @@ -2102,37 +2102,69 @@ static int treat_leading_path(struct dir_struct *dir, > > > > const struct pathspec *pathspec) > > > > { > > > > struct strbuf sb = STRBUF_INIT; > > > > - int baselen, rc = 0; > > > > + int prevlen, baselen; > > > > const char *cp; > > > > + struct cached_dir cdir; > > > > + struct dirent de; > > > > + enum path_treatment state = path_none; > > > > + > > > > + /* > > > > + * For each directory component of path, we are going to check whether > > > > + * that path is relevant given the pathspec. For example, if path is > > > > + * foo/bar/baz/ > > > > + * then we will ask treat_path() whether we should go into foo, then > > > > + * whether we should go into bar, then whether baz is relevant. > > > > + * Checking each is important because e.g. if path is > > > > + * .git/info/ > > > > + * then we need to check .git to know we shouldn't traverse it. > > > > + * If the return from treat_path() is: > > > > + * * path_none, for any path, we return false. > > > > + * * path_recurse, for all path components, we return true > > > > + * * <anything else> for some intermediate component, we make sure > > > > + * to add that path to the relevant list but return false > > > > + * signifying that we shouldn't recurse into it. > > > > + */ > > > > > > > > while (len && path[len - 1] == '/') > > > > len--; > > > > if (!len) > > > > return 1; > > > > + > > > > + memset(&cdir, 0, sizeof(cdir)); > > > > + memset(&de, 0, sizeof(de)); > > > > + cdir.de = &de; > > > > + de.d_type = DT_DIR; > > > > > > So here, `de` is zeroed out, and therefore `de.d_name` is `NULL`. > > > > Um, yeah...didn't I have an allocation of de.d_name here? It will > > always have a subset of path copied into it, so an allocation of len+1 > > is plenty long enough. > > Actually, it looks like I looked up the definition of dirent > previously and forgot by the time you emailed. On linux, from > /usr/include/bits/dirent.h: > > struct dirent > { > .... > unsigned char d_type; > char d_name[256]; /* We must not include limits.h! */ > }; > > and from compat/win32/dirent.h defines it as: > > struct dirent { > unsigned char d_type; /* file type to prevent lstat after > readdir */ > char d_name[MAX_PATH * 3]; /* file name (* 3 for UTF-8 conversion) */ > }; > > and 'man dirent' on Mac OS X says it's defined as: > > struct dirent { > ... > _uint8_t d_type; > _unit8_t d_namlen; /* length of string in d_name */ > char d_name[255+1]; /* name must be no longer than this */ > } > > so, allocating it would be incorrect and my memset would just fill > d_name with nul characters. > > > But the raises the question...what kind of segfaults are you getting? > Can you link to any builds or post any stack traces? Can I duplicate > with some copy of git-for-windows on linux? If you care to look at our very own `compat/win32/dirent.h`, you will see this: struct dirent { unsigned char d_type; /* file type to prevent lstat after readdir */ char *d_name; /* file name */ }; And looking at https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/dirent.h.html, I do not see any guarantee of that `[256]` at all: The <dirent.h> header shall [...] define the structure dirent which shall include the following members: [XSI][Option Start] ino_t d_ino File serial number. [Option End] char d_name[] Filename string of entry. You will notice that not even `d_type` is guaranteed. Ciao, Dscho