René Scharfe <l.s.r@xxxxxx> writes:

>> -	while (sz &&
>> -	       !regexec_buf(regexp, data, sz, 1, &regmatch, flags)) {
>> +	while (!regexec_buf(regexp, data, sz, 1, &regmatch, flags)) {
>
> This will loop forever for regexes that match an empty string.  An
> example would be /$/.  Silly, perhaps, but still I understand this check
> less as an optimization and more as a correctness/robustness thing.
>
>>  		flags |= REG_NOTBOL;
>>  		data += regmatch.rm_eo;
>>  		sz -= regmatch.rm_eo;
>> -		if (sz && regmatch.rm_so == regmatch.rm_eo) {
>> +		if (regmatch.rm_so == regmatch.rm_eo) {
>>  			data++;
>>  			sz--;
>>  		}
>
> Before, if the match was an empty string and there was more data after
> it, then the code would consume a character anyway, in order to avoid
> matching the same empty string again.  With the patch, that character
> is consumed even if there is no more data.  This leaves 'data'
> pointing beyond the buffer and 'sz' rolls over to ULONG_MAX.

Oops. :(

While I do not care too much about NUL in the haystack, I do not
mind [13/25] either.  But this is bad.

This whole thing reminds me of f53c5de2 (pickaxe: fix segfault with
'-S<...> --pickaxe-regex', 2017-03-18), by the way.

Thanks.
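
For illustration only, here is a minimal standalone sketch of the
guarded loop discussed above.  It assumes plain POSIX regexec() on a
NUL-terminated string rather than git's regexec_buf(), and
count_matches() is a made-up helper name, not anything in the tree.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not git's code: count the matches of `pattern`
 * in the NUL-terminated string `text` using plain POSIX regexec().
 * Returns -1 if the pattern does not compile.
 */
int count_matches(const char *text, const char *pattern)
{
	regex_t re;
	regmatch_t m;
	const char *data = text;
	size_t sz;
	int flags = 0;
	int count = 0;

	assert(text && pattern);
	sz = strlen(text);

	if (regcomp(&re, pattern, REG_EXTENDED))
		return -1;

	/* Stop once the buffer is used up, even if the regex could still match "". */
	while (sz && !regexec(&re, data, 1, &m, flags)) {
		count++;
		flags |= REG_NOTBOL;
		data += m.rm_eo;
		sz -= m.rm_eo;
		/* Step over one byte after an empty match, but only if one is left. */
		if (sz && m.rm_so == m.rm_eo) {
			data++;
			sz--;
		}
	}

	regfree(&re);
	return count;
}
```

With /$/ on "abc" this finds the one empty match at the end and stops,
because sz is already zero.  Without the `sz` check in the inner `if`,
that same case would advance 'data' past the terminating NUL and wrap
'sz' around, which is the problem René describes.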