Am 09.08.2017 um 07:29 schrieb Junio C Hamano:
> René Scharfe <l.s.r@xxxxxx> writes:
>
>> Am 09.08.2017 um 00:26 schrieb Junio C Hamano:
>>> ... but in the meantime, I think replacing the test with "0$" to
>>> force the scanner to find either the end of line or the end of the
>>> buffer may be a good workaround.  We do not have to care how many of
>>> random bytes are in front of the last "0" in order to ensure that
>>> the regexec_buf() does not overstep to 4097th byte, while seeing
>>> that regexec() that does not know how long the haystack is has to do
>>> so, no?
>>
>> Our regexec() calls strlen() (see my other reply).
>>
>> Using "0$" looks like the best option to me.
>
> Yeah, it seems that way.  If we want to be close/faithful to the
> original, we could do "^0*$", but the part that is essential to
> trigger the old bug is not the "we have many zeroes" (or "we have
> 4096 zeroes") part, but the "zero is at the end of the string" part,
> so "0$" would be the minimal pattern that also would work for OBSD.

Thought about it a bit more.  "^0{4096}$" checks whether the byte
after the buffer is \n or \0, in the hope of triggering a segfault.
On Linux I can access that byte just fine; perhaps there is no guard
page.  Also there is a 2 in 256 chance of that byte being \n or \0
(provided its value is random), which would cause the test to falsely
report success.

"0$" effectively looks for "0\n" or "0\0", which can only occur after
the buffer.  If that string is found close enough then we may not
trigger a segfault, and report a false positive.

In the face of unreliable segfaults we need to reverse our strategy,
I think.  Searching for something that is not in the buffer (e.g.
"1") and treating both matches and segfaults as confirmation that the
bug is still present should avoid any false positives.  Right?

Thanks,
René