Re: [PATCH] run-command: avoid undefined behavior in exists_in_PATH

Bryan Turner <bturner@xxxxxxxxxxxxx> · Mon, 6 Jan 2020 19:41:30 -0800



On Mon, Jan 6, 2020 at 7:40 PM Bryan Turner <bturner@xxxxxxxxxxxxx> wrote:
>
> On Mon, Jan 6, 2020 at 6:04 PM Jonathan Nieder <jrnieder@xxxxxxxxx> wrote:
> >
> > brian m. carlson wrote:
> >
> > > In this function, we free the pointer we get from locate_in_PATH and
> > > then check whether it's NULL.  However, this is undefined behavior if
> > > the pointer is non-NULL, since the C standard no longer permits us to
> > > use a valid pointer after freeing it.
> > [...]
> > > Noticed-by: Miriam R. <mirucam@xxxxxxxxx>
> > > Signed-off-by: brian m. carlson <sandals@xxxxxxxxxxxxxxxxxxxx>
> > > ---
> > >  run-command.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > Reviewed-by: Jonathan Nieder <jrnieder@xxxxxxxxx>
> >
> > This API that forces the caller to deal with the allocated result when
> > they never asked for it seems a bit inconvenient.  Should we clean it up
> > a little?  Something like this (on top):
> >
> > -- >8 --
> > Subject: run-command: let caller pass in buffer to locate_in_PATH
> >
> > Instead of returning a buffer that the caller is responsible for
> > freeing, use a strbuf output parameter to record the path to the
> > searched-for program.
> >
> > This makes ownership a little easier to reason about, since the owning
> > code declares the buffer.  It's a good habit to follow because it
> > allows buffer reuse when calling such a function in a loop.
> >
> > It also allows the caller exists_in_PATH that does not care about the
> > path to the command to be slightly simplified, by allowing a NULL
> > output parameter that means that locate_in_PATH should take care of
> > allocating and freeing its temporary buffer.
> >
> > Signed-off-by: Jonathan Nieder <jrnieder@xxxxxxxxx>
> > ---
> >  run-command.c | 51 +++++++++++++++++++++++++++++----------------------
> >  1 file changed, 29 insertions(+), 22 deletions(-)
> >
> > diff --git i/run-command.c w/run-command.c
> > index f5e1149f9b..a6dc38396a 100644
> > --- i/run-command.c
> > +++ w/run-command.c
> > @@ -170,52 +170,57 @@ int is_executable(const char *name)
> >   * The caller should ensure that file contains no directory
> >   * separators.
> >   *
> > - * Returns the path to the command, as found in $PATH or NULL if the
> > - * command could not be found.  The caller inherits ownership of the memory
> > - * used to store the resultant path.
> > + * Returns a boolean indicating whether the command was found in $PATH.
> > + * The path to the command is recorded in the strbuf 'out', if supplied.
> >   *
> >   * This should not be used on Windows, where the $PATH search rules
> >   * are more complicated (e.g., a search for "foo" should find
> >   * "foo.exe").
> >   */
> > -static char *locate_in_PATH(const char *file)
> > +static int locate_in_PATH(const char *file, struct strbuf *out)
> >  {
> >         const char *p = getenv("PATH");
> >         struct strbuf buf = STRBUF_INIT;
> >
> > -       if (!p || !*p)
> > -               return NULL;
> > +       if (!out)
> > +               out = &buf;
> > +
> > +       if (!p || !*p) {
> > +               strbuf_reset(out);
>
> Since the while loop and this block both call strbuf_reset(out);, is
> there a reason not to do:
> if (out)
>         strbuf_reset(out);
> } else {
>         out = &buf;
> }
> above? The loop below would still need its own reset, but it could do
> that when it decides to loop.

Ignore my incorrect braces; force of habit. I meant:
if (out)
        strbuf_reset(out);
else
        out = &buf;

>
> > +               strbuf_release(&buf);
> > +               return 0;
> > +       }
> >
> >         while (1) {
> >                 const char *end = strchrnul(p, ':');
> >
> > -               strbuf_reset(&buf);
> > +               strbuf_reset(out);
>
> This reset would be removed
>
> >
> >                 /* POSIX specifies an empty entry as the current directory. */
> >                 if (end != p) {
> > -                       strbuf_add(&buf, p, end - p);
> > -                       strbuf_addch(&buf, '/');
> > +                       strbuf_add(out, p, end - p);
> > +                       strbuf_addch(out, '/');
> >                 }
> > -               strbuf_addstr(&buf, file);
> > +               strbuf_addstr(out, file);
> >
> > -               if (is_executable(buf.buf))
> > -                       return strbuf_detach(&buf, NULL);
> > +               if (is_executable(out->buf)) {
> > +                       strbuf_release(&buf);
> > +                       return 1;
> > +               }
> >
>
> We'd call strbuf_reset(out); here instead, before we break or loop.
>
> >                 if (!*end)
> >                         break;
> >                 p = end + 1
> >         }
> >
> > +       strbuf_reset(out);
>
> And we could leave this one out; if we make it here, we'd know the
> buffer was already reset.
>
> >         strbuf_release(&buf);
> > -       return NULL;
> > +       return 0;
> >  }
> >
> >  static int exists_in_PATH(const char *file)
> >  {
> > -       char *r = locate_in_PATH(file);
> > -       int found = r != NULL;
> > -       free(r);
> > -       return found;
> > +       return locate_in_PATH(file, NULL);
> >  }
> >
> >  int sane_execvp(const char *file, char * const argv[])
> > @@ -427,15 +432,17 @@ static int prepare_cmd(struct argv_array *out, const struct child_process *cmd)
> >          * directly.
> >          */
> >         if (!strchr(out->argv[1], '/')) {
> > -               char *program = locate_in_PATH(out->argv[1]);
> > -               if (program) {
> > -                       free((char *)out->argv[1]);
> > -                       out->argv[1] = program;
> > -               } else {
> > +               struct strbuf program = STRBUF_INIT;
> > +
> > +               if (!locate_in_PATH(out->argv[1], &program)) {
> >                         argv_array_clear(out);
> > +                       strbuf_release(&program);
> >                         errno = ENOENT;
> >                         return -1;
> >                 }
> > +
> > +               free((char *)out->argv[1]);
> > +               out->argv[1] = strbuf_detach(&program, NULL);
> >         }
> >
> >         return 0;
>
> Just a thought. (Pardon the noise from the peanut gallery!)
>
> Best regards,
> Bryan Turner