Johannes Altmanninger <aclopte@xxxxxxxxx> writes: > diff -I<regex> suppresses hunks where all +/- lines match <regex>. > it is useful to filter away boilerplate changes. > > Unfortunately, it doesn't help if a hunk has a blank line, because the one > obvious way to filter out blank lines (^$) match *every* line, surprisingly. > This is because for a line "01\n" > we have a zero-width match here ^$ (offset 3). > > IOW, while we succesfully ignore deleted blank lines > > printf '\n' | git diff --no-index - /dev/null -I'^$' > diff --git a/- b/- > deleted file mode 100644 > > we also ignore non-blank lines (very surprising) > > printf 'non-blank-line\n' | git diff --no-index - /dev/null -I'^$' > diff --git a/- b/- > deleted file mode 100644 > > unless they don't end in a newline (special case) > > printf 'line without ending newline' | git diff --no-index - /dev/null -I'^$' > diff --git a/- b/- > deleted file mode 100644 > --- a/- > +++ /dev/null > @@ -1 +0,0 @@ > -line without ending newline > \ No newline at end of file > > This patch fixes the second example. Is this the right direction? I do not know where in the code the breakage in the first example comes from. It sounds like a bug if a pattern is not matched honoring the anchor, whether the beginning-of-line "^" or the end-of-line "$" one. > Do we want to honor core.eol, so we preserve the \r when we have Unix endings? Absolutely. I think the code paths that work on blob data (i.e. Git internal, before the smudge operation is applied to externalize it to a working tree file) should only consider LF as the end of line (and any existing code that doesn't is a bug). > In any case -I<regex> won't be able to discern between "line\n" and "line" > but that's not important to me. > > diff --git a/xdiff/xdiffi.c b/xdiff/xdiffi.c > index a4542c05b6..23325022b9 100644 > --- a/xdiff/xdiffi.c > +++ b/xdiff/xdiffi.c > @@ -1016,10 +1016,17 @@ static void xdl_mark_ignorable_lines(xdchange_t *xscr, xdfenv_t *xe, long flags) > static int record_matches_regex(xrecord_t *rec, xpparam_t const *xpp) { > regmatch_t regmatch; > int i; > + const char *end = rec->ptr + rec->size; > + > + if (rec->size >= 2 && end[-2] == '\r' && end[-1] == '\n') { > + end -= 2; > + } else if (rec->size && end[-1] == '\n') { > + end -= 1; > + } > > for (i = 0; i < xpp->ignore_regex_nr; i++) > - if (!regexec_buf(xpp->ignore_regex[i], rec->ptr, rec->size, 1, > - ®match, 0)) > + if (!regexec_buf(xpp->ignore_regex[i], rec->ptr, > + end - rec->ptr, 1, ®match, 0)) > return 1; > > return 0;