On Sun, May 20, 2018 at 3:50 AM, Martin Ågren <martin.agren@xxxxxxxxx> wrote: > It is apparently undefined behavior to call `regfree()` on a regex where > `regcomp()` failed. The language in [1] is a bit muddy, at least to me, > but the clearest hint is this (`preg` is the `regex_t *`): > > Upon successful completion, the regcomp() function shall return 0. > Otherwise, it shall return an integer value indicating an error as > described in <regex.h>, and the content of preg is undefined. > > Funnily enough, there is also the `regerror()` function which should be > given a pointer to such a "failed" `regex_t` -- the content of which > would supposedly be undefined -- and which may investigate it to come up > with a detailed error message. > > In any case, the example in that document shows how `regfree()` is not > called after `regcomp()` fails. > > We have quite a few users of this API and most get this right. These > three users do not. > > Several implementations can handle this just fine [2] and these code paths > supposedly have not wreaked havoc or we'd have heard about it. (These > are all in code paths where git got bad input and is just about to die > anyway.) But let's just avoid the issue altogether. > > [1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/regcomp.html > > [2] https://www.redhat.com/archives/libvir-list/2013-September/msg00262.html > > Researched-by: Eric Sunshine <sunshine@xxxxxxxxxxxxxx> > Signed-off-byi Martin Ågren <martin.agren@xxxxxxxxx> > --- > > On 14 May 2018 at 05:05, Eric Sunshine <sunshine@xxxxxxxxxxxxxx> wrote: >> My research (for instance [1,2]) seems to indicate that it would be >> better to avoid regfree() upon failed regcomp(), even though such a >> situation is handled sanely in some implementations. >> >> [1]: https://www.redhat.com/archives/libvir-list/2013-September/msg00276.html >> [2]: https://www.redhat.com/archives/libvir-list/2013-September/msg00273.html > > Thank you for researching this. I think it would make sense to get rid > of the few places we have where we get this wrong (if our understanding > of this being undefined is right). > > diffcore-pickaxe.c | 1 - > grep.c | 2 -- > 2 files changed, 3 deletions(-) > > diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c > index 239ce5122b..800a899c86 100644 > --- a/diffcore-pickaxe.c > +++ b/diffcore-pickaxe.c > @@ -215,7 +215,6 @@ static void regcomp_or_die(regex_t *regex, const char *needle, int cflags) > /* The POSIX.2 people are surely sick */ > char errbuf[1024]; > regerror(err, regex, errbuf, 1024); > - regfree(regex); While the commit message is very clear why we supposedly introduce a leak here, it is hard to be found from the source code (as we only delete code there, so digging for history is not obvious), so maybe /* regfree(regex) is invalid here */ instead? > die("invalid regex: %s", errbuf); > } > } > diff --git a/grep.c b/grep.c > index 65b90c10a3..5e4f3f9a9d 100644 > --- a/grep.c > +++ b/grep.c > @@ -636,7 +636,6 @@ static void compile_fixed_regexp(struct grep_pat *p, struct grep_opt *opt) > if (err) { > char errbuf[1024]; > regerror(err, &p->regexp, errbuf, sizeof(errbuf)); > - regfree(&p->regexp); > compile_regexp_failed(p, errbuf); > } > } > @@ -701,7 +700,6 @@ static void compile_regexp(struct grep_pat *p, struct grep_opt *opt) > if (err) { > char errbuf[1024]; > regerror(err, &p->regexp, errbuf, 1024); > - regfree(&p->regexp); > compile_regexp_failed(p, errbuf); > } > } > -- > 2.17.0.840.g5d83f92caf >