Junio C Hamano <gitster@xxxxxxxxx> writes: > David Kastrup <dak@xxxxxxx> writes: > >>> so something like >>> >>> for (p = buf; p < end; p++) { >>> p = find the end of this line >>> if (!p) >>> break; >>> num++; >>> } >>> >>> perhaps? >> >> Would crash on incomplete last line. > > Hmph, even with "if !p"? Admitted. So we have one _proper_ loop termination condition in the middle of the loop, and we have a snake oil condition at the start of the loop that can, according to the standard, _only_ yield true reliably when p == end (any end > p can only result from undefined behavior when p points to an object of size end - p). The effect of this condition basically is to state "we don't trust memchr to do the right thing when called with 0 as its last argument which can happen at most once". This condition will come about by the _combined_ effect of p = ... _in_ the loop (which is the real iteration advancing p) and p++ hidden in the for statement (which never makes any sense separated from the p = ...). It turns out that a) memchr is provided by a compatibility layer in case it is missing or defective b) other code in Git demands that memchr works correctly with a zero last argument. See, for example, attr.c: if (dirlen <= len) break; cp = memchr(path + len + 1, '/', dirlen - len - 1); This clearly needs to be able to work with dirlen == len + 1. So the loop gets rewritten in a manner where the for statement _pretends_ to loop linearly through buf, but where the loop _body_ has its own _regular_ termination condition it shields from the original for, and p is advanced _independently_. The advancement of p to beyond the next '\n' is distributed to two different places in the loop, and one place is made to look as if it does something else. > But that last round of the loop will be no-op, so "p < end" vs "p <= > end" does not make any difference. It is not even strictly > necessary because memchr() limits the scan to end, but it would > still be a good belt-and-suspenders defensive coding practice, I > would think. It's snake oil making debugging harder. It masks the real action, and it will route around suspected faulty behavior that is wrong according to the standards and not permitted elsewhere in the codebase. Other than that, it just takes additional performance from every iteration. Putting belts and suspenders on a bike looks comforting until the suspenders get caught in the wheelspokes. > which is the same as > > for (p = buf; ; p++) { > *lineno++ = p - buf; > p = memchr... > if (!p) > break; > } > > and the latter has the loop control p++ at where it belongs to. But it's only half the control. As it is clear that we won't get to a state where I'd be willing to have "git blame" pointing to me as the author, I suggest that you either put your belts and suspenders on with a separate commit under your own authorship, or that we drop this altogether. As I stated already: this patch was just provided because the original code offends my sense of aesthetics, so there is no point to it if it offends yours. I'd still recommend fixing the sizeof(int *) with sizeof(int) confusion. > If we wanted to have a belt-and-suspenders safety, we need "p <= > end" here, not "p < end", That would be totally ridiculous since end > p cannot ever happen except by undefined behavior. Pointer inequalities require both pointers to be associated with the same object. > This was fun ;-) At the expense of seriously impacting my motivation to do any further code cleanup on Git. Which is a reasonable tradeoff since your fun is more important to Git's well-being than my one. -- David Kastrup -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html