On Thu, Nov 23, 2017 at 10:41 AM, Marc-Antoine Ruel <maruel@xxxxxxxxxxxx> wrote: > This tells git grep to skip files longer than a specified length, s/files/lines/ > which is often the result of generators and not actual source files. > > Signed-off-by: Marc-Antoine Ruel <maruel@xxxxxxxxxxxx> > --- > diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt > @@ -127,6 +128,10 @@ OPTIONS > +-M<num>:: > +--max-line-len<num>:: > + Match the pattern only for line shorter or equal to this length. This documentation doesn't explain what it means by a line's length. This implementation seems to take into consideration only the line's byte count, however, given that displayed lines are normally tab-expanded, people might intuitively expect that expansion to count toward the length. A similar question arises for Unicode characters. Should this option take tab-expansion and Unicode into account? Regardless of the answer to that question, the documentation should make it clear what "line length" means. > diff --git a/builtin/grep.c b/builtin/grep.c > @@ -796,6 +796,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix) > N_("case insensitive matching")), > OPT_BOOL('w', "word-regexp", &opt.word_regexp, > N_("match patterns only at word boundaries")), > + OPT_INTEGER('M', "max-line-len", &opt.max_line_length, > + N_("ignore lines longer than <n>")), On this project, it is typical to have only the long form of an option name when the option is first implemented so as not to squat on one of the relatively limited number of short option names. Only after it is seen that an option is popular, does it get a short name. Whether this actually matters in this case, I don't know, but is something to take into consideration. Why the choice of 'M'? Is there precedence in other Git or Unix commands for that name over, say, 'N' or <choose-your-favorite-letter>? Is 'M' is for "max"? If so, what would a short name for --max-depth be (if it ever got one)? (These are somewhat rhetorical questions, but I'm interested in the answers if you have any.) > diff --git a/grep.c b/grep.c > @@ -1273,6 +1275,8 @@ static int match_line(struct grep_opt *opt, char *bol, char *eol, > + if (opt->max_line_length > 0 && eol-bol > opt->max_line_length) > + return 0; If the user specifies "-M0", should that error out or at least warn the user that the value is non-sensical? What about -1, etc.? (These are UX-related questions; the implementation obviously doesn't care one way or the other.) > diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh > @@ -766,6 +766,11 @@ test_expect_success 'grep -W shows no trailing empty lines' ' > +test_expect_success 'grep skips long lines' ' > + git grep -M18 -W include >actual && > + test_cmp expected actual > +' Meh, what is this actually testing? The output of "git grep -M18 -W include" is no different that the output of "git grep -W include" in the test just above this one. And, why is -W in this test at all? Its presence confuses the reader into thinking that it is somehow significant. I would have expected this test to look like this: cat >expected <<-\EOF hello.c:#include <stdio.h> EOF test_expect_success 'grep -M skips long lines' ' git grep -M18 include >actual && test_cmp expected actual '