On Sat, Oct 02 2021, Ævar Arnfjörð Bjarmason wrote: > On Fri, Oct 01 2021, Elijah Newren wrote: > >> On Fri, Oct 1, 2021 at 1:47 AM Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> wrote: >>> >>> On Thu, Sep 30 2021, Elijah Newren wrote: >>> >>> > On Thu, Sep 30, 2021 at 7:15 AM Ævar Arnfjörð Bjarmason >>> > <avarab@xxxxxxxxx> wrote: >>> >> >>> >> On Wed, Sep 29 2021, Elijah Newren wrote: [...] >>> > I might be going on a tangent here, but looking at that patch, I'm >>> > worried that dir_init() was buggy and that you perpetuated that bug >>> > with DIR_INIT. Note that dir_struct has a struct strbuf basebuf >>> > member, which neither dir_init() or DIR_INIT initialize properly >>> > (using either strbuf_init() or STRBUF_INIT). As far as I can tell, >>> > dir.c relies on either strbuf_add() calls to just happen to work with >>> > this incorrectly initialized strbuf, or else use the strbuf_init() >>> > call in prep_exclude() to do so, using the following snippet: >>> > >>> > if (!dir->basebuf.buf) >>> > strbuf_init(&dir->basebuf, PATH_MAX); >>> > >>> > However, earlier in that same function we see >>> > >>> > if (stk->baselen <= baselen && >>> > !strncmp(dir->basebuf.buf, base, stk->baselen)) >>> > break; >>> > >>> > So either that function can never have dir->basebuf.buf be NULL and >>> > the strbuf_init() is dead code, or else it's possible for us to >>> > trigger a segfault. If it's the former, it may just be a ticking time >>> > bomb that will transform into the latter with some other change, >>> > because it's not at all obvious to me how dir->basebuf gets >>> > initialized appropriately to avoid that strncmp call. Perhaps there >>> > is some invariant where exclude_stack is only set up by previous calls >>> > to prep_exclude() and those won't set up exclude_stack until first >>> > initializing basebuf. But that really at least deserves a comment >>> > about how we're abusing basebuf, and would probably be cleaner if we >>> > initialized basebuf to STRBUF_INIT. >>> >>> ...because yes, I forgot about that when sending you the diff-on-top, >>> sorry. Yes that's buggy with the diff-on-top I sent you. >> >> That bug didn't come from the diff-on-top you sent me, it came from >> the commit already merged to master -- ce93a4c6127 (dir.[ch]: replace >> dir_init() with DIR_INIT, 2021-07-01), merged as part of >> ab/struct-init on Jul 16. > > Ah, I misunderstood you there. I'll look at that / fix it. Sorry. Just to tie up this loose end: Yes this control flow suck, and I've got some patches to unpack-trees.[ch] & dir.[ch] I'm about to submit to fix it. But just to comment on the existing behavior of the code, i.e. your (above): "So either that function can never have dir->basebuf.buf be NULL and the strbuf_init() is dead code, or else it's possible for us to trigger a segfault.". I hadn't had time to look into it when I said I'd fix it, but now that I have I found thath there's nothing to fix, and this code wasn't buggy either before or after my ce93a4c6127 (dir.[ch]: replace dir_init() with DIR_INIT, 2021-07-01). I.e. we do have the invariant you mentioned. The dir.[ch] API has always relied on the "struct dir_struct" being zero'd out. First with memset() before your eceba532141 (dir: fix problematic API to avoid memory leaks, 2020-08-18), and after my ce93a4c6127 with the DIR_INIT, which both amount to the same thing. We both missed a caller that used neither dir_init() nor uses DIR_INIT now, but it uses "{ 0 }", so it's always zero'd. Now, of course it being zero'd *would* segfault if you feed "dir->basebuf.buf" to strncmp() as you note above, but that code isn't reachable. The structure of that function is (pseudocode): void prep_exclude(...) { struct exclude_stack *stk = NULL; [...] while ((stk = dir->exclude_stack) != NULL) /* the strncmp() against "dir->basebuf.buf" is here */ /* maybe we'll early return here */ if (!dir->basebuf.buf) strbuf_init(&dir->basebuf, PATH_MAX); /* * Code that sets dir->exclude_stack to non-NULL for the first * time follows... */ } I.e. dir->exclude_stack is *only* referenced in this function and dir_clear() (where we also check it for NULL first). It's state management between calls to prep_exclude(). So that that initial while-loop can only be entered the the >1th time prep_exclude() is called. We'll then either have reached that strbuf_init() already, or if we took an early return before the strbuf_init() we couldn't have set dir->exclude_stack either. So that "dir->basebuf.buf" dereference is safe in either case.