Hi Peff,

On Tue, 6 Sep 2016, Jeff King wrote:

> On Mon, Sep 05, 2016 at 05:45:09PM +0200, Johannes Schindelin wrote:
>
> > Before calling regexec() on the file contents, we better be certain that
> > the strings fulfill the contract of C strings assumed by said function.
>
> If you have a buffer that is exactly "size" bytes and you are worried
> about regexec reading off the end, then...
>
> > diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
> > index 55067ca..88820b6 100644
> > --- a/diffcore-pickaxe.c
> > +++ b/diffcore-pickaxe.c
> > @@ -49,6 +49,8 @@ static int diff_grep(mmfile_t *one, mmfile_t *two,
> >  	xpparam_t xpp;
> >  	xdemitconf_t xecfg;
> >
> > +	assert(!one || one->ptr[one->size] == '\0');
> > +	assert(!two || two->ptr[two->size] == '\0');
> >  	if (!one)
> >  		return !regexec(regexp, two->ptr, 1, &regmatch, 0);
>
> ...don't your asserts also read off the end?

Yes, they would read off the end, *unless* a NUL was somehow appended to
the buffers.

> So you might still segfault, though you do catch a case where we have N
> bytes of junk before the end of the page (and you have a 255/256 chance
> of catching it).

Right. The assertion may fail, or a segfault happen. In both cases,
assumptions are violated and we need to fix the code.

Ciao,
Dscho
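
To illustrate the concern discussed in the thread above: regexec() expects a NUL-terminated string, while the buffer is only known to hold "size" bytes. One way to satisfy that contract without ever reading ptr[size] is to copy the bytes into a buffer that has an explicit terminator. The following is only a minimal sketch under that assumption; match_counted_buffer() is a hypothetical helper made up for illustration, not a function in Git and not the fix proposed in this thread.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Git): match a compiled regex against
 * a buffer that is exactly `size` bytes long and has no NUL guarantee.
 *
 * The bytes are copied into a freshly allocated buffer with an explicit
 * terminator, so neither this function nor regexec() reads ptr[size].
 *
 * Returns 1 on a match, 0 on no match, -1 if allocation fails.
 */
static int match_counted_buffer(const regex_t *regexp,
				const char *ptr, size_t size)
{
	regmatch_t regmatch;
	char *copy;
	int hit;

	/* A NULL pointer is only acceptable for an empty buffer. */
	assert(ptr || !size);

	copy = malloc(size + 1);
	if (!copy)
		return -1;
	if (size)
		memcpy(copy, ptr, size);
	copy[size] = '\0';	/* the terminator regexec() relies on */

	hit = !regexec(regexp, copy, 1, &regmatch, 0);
	free(copy);
	return hit;
}
```

The copy costs an allocation per call, and a NUL byte inside the data would still cut the match short, so this is a way to make the C-string contract explicit rather than a complete answer for binary content.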