Re: [PATCH] grep: add --max-count command line option

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi list,

Thanks to everyone who provided feedback :)

On Saturday, May 14th, 2022 at 20:16, Martin Ågren <martin.agren@xxxxxxxxx> wrote:
> I think this should be
>
> [(-m | --max-count) <num>]

> Please use `backticks` with `-v` and `--invert-match` so that they are
> set in monospace.

I will add these suggestions to my patch.

> Regarding the special value 0, it's a bit unclear what "has no effect"
> means. In particular, it can have an effect in the sense that when it
> is used like
>
> git grep -m 1 -m 0 foo
>
> it undoes the `-m 1`.
>
> But also, that's not how my grep(1) behaves: with `-m 0`, it limits the
> number of matches to zero. I don't know how useful that is (can that
> zero-case be optimized by exiting with 1 before even trying to find the
> needle!?), or if maybe different variants of grep handle this
> differently? If all grep implementations handle 0 by actually only
> emitting zero hits, I think it would be wise for us to handle 0 the same
> way.

I agree the wording is not clear. I did not see a good use case for GNU's `-m 0`, which is why I used that value as unlimited. I am not sold on using `--no-max-count` or -1 *just* for consistency, but if someone can point to a good use case of GNU's `-m 0` (especially in git grep), I will gladly concede.

> Even if we want to handle the zero just like you do, I think this patch
> needs a few tests. We should make sure to test the 0-case (whatever we
> end up wanting it to behave like), and probably the "suppress an earlier
> -m by giving --no-max-count" case. It also seems wise to set up some
> test scenario where there are several files involved so that we can see
> that we don't just print the first m matches globally, but that the
> counter is really handled per file.

This seems sound. Is there any documentation on how to write tests for git?

On Monday, May 16th, 2022 at 07:57, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> Good thing that this is defined as "per-file" limit. If it were a
> global limit, the interaction between this one and "--threads=<num>"
> would have been interesting. Perhaps add a test to make sure the
> feature continues to work with "--threads=2" (I am assuming that you
> have already tested this implementation works with the option).

I did and I found no unexpected behavior.

> What "git grep -m -1" should do? IIRC, OPT_INTEGER is for signed
> integer but the new .max_count member, as well as the existing
> "count" that is compared with it, are of "unsigned" type. Either
> erroring out or treating it as unlimited is probably fine, but
> whatever we do, we should document and have a test for it.

I would favor treating it as an error. As mentioned above, using 0 to describe "unlimited matches" (e.g. the default) is my preference, but I am willing to concede if someone can think of a good use for `-m 0`. Also, from the implementation side (although not as important) it looks better: if we allow negative values, we need to distinguish between -1 (unlimited) and -4 (display error to user, probably) - the patch is much simpler right now. And just as a side note, we avoid an issue in the pretty much insignificant use case of giving a very big value (UINT_MAX) for `-m` and it overflowing into -1, thus not properly limiting the number of matches.

On Monday, May 16th, 2022 at 09:28, Paul Eggert <eggert@xxxxxxxxxxx> wrote:
> As I vaguely recall, if "-m 1" stops before "-m 2" does, then the idea
> was that it's reasonable for "-m 0" to stop before "-m 1" does, and the
> logical place to stop is right at the start, before any matches are
> found (i.e., exit with status 1).

As I mentioned above, I do not see what this `-m 0` behavior is useful for, but if someone could show me an use for it I would appreciate it.

Again, thank you everyone for your comments.




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux