Re: [PATCH 0/5] const-correctness in grep.c

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Sep 21, 2021 at 02:07:08PM +0200, Ævar Arnfjörð Bjarmason wrote:

> This whole thing looks good to me. I only found a small whitespace nit
> in one of the patches. Did you consider following-up by having this code
> take const char*/const size_t pairs. E.g. starting with something like
> the below.

I do generally find ptr/len pairs to be easier to read, but they also
make it really easy to introduce subtle bugs. E.g., if you consume part
of a buffer, you have to tweak both the ptr and the len. So the current:

	while (word_char(bol[-1]) && bol < eol)
		bol++;

has to become:

	while (word_char(bol[-1] && len > 0) {
		bol++;
		len--;
	}

So I'd be hesitant to churn battle-tested code in such a way for what I
consider to be a pretty minor benefit.

I did notice the ugly use of "unsigned long" here in a few places
(rather than size_t). I do think it is worth fixing, but it seemed a
little too far to try to cram into this series (it's obviously touching
the same lines, but it's quite orthogonal semantically).

The other hesitation I had is that the source of this "unsigned long"
pattern is almost certainly the object code (which is much more
important to convert, as it blocks people from having >4GB objects on
Windows). So we might want to just wait for a larger conversion there.
OTOH, I don't think there is any downside to a partial conversion here
in the meantime (because size_t will always be at least as long as
"unsigned long" in practice).

> -static int pcre2match(struct grep_pat *p, const char *line, const char *eol,
> +static int pcre2match(struct grep_pat *p, const char *line, const size_t len,
>  		regmatch_t *match, int eflags)
>  {
>  	int ret, flags = 0;
> @@ -448,11 +448,11 @@ static int pcre2match(struct grep_pat *p, const char *line, const char *eol,
>  
>  	if (p->pcre2_jit_on)
>  		ret = pcre2_jit_match(p->pcre2_pattern, (unsigned char *)line,
> -				      eol - line, 0, flags, p->pcre2_match_data,
> +				      len, 0, flags, p->pcre2_match_data,
>  				      NULL);
>  	else
>  		ret = pcre2_match(p->pcre2_pattern, (unsigned char *)line,
> -				  eol - line, 0, flags, p->pcre2_match_data,
> +				  len, 0, flags, p->pcre2_match_data,
>  				  NULL);

Not related to your point, but these casts are funny now. They are meant
to cast to "unsigned char" pointers to match pcre's signature, but now
they are casting away const-ness, too. That might be worth fixing as
part of this series.

Though should they really be casting to PCRE2_SPTR? The types are opaque
in their API because of the weird multi-width thing, though I find it
hard to imagine us ever using the wider versions of the library.

-Peff



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux