Re: [musl] Re: Regression: git no longer works with musl libc's regex impl

Hi Rich,

On Tue, 4 Oct 2016, Rich Felker wrote:

> On Tue, Oct 04, 2016 at 06:08:33PM +0200, Johannes Schindelin wrote:
>
> > And lastly, the best alternative would be to teach musl about
> > REG_STARTEND, as it is rather useful a feature.
> 
> Maybe, but it seems fundamentally costly to support -- it's extra
> state in the inner loops that imposes costly spill/reload on archs
> with too few registers (x86).

It is true that the extra state could cause such spills.

I had a brief look at the source code (you use backtracking... hopefully
nobody uses musl to parse regular expressions from untrusted, or
inexperienced, sources [*1*]), and it seems that the regex code might
spill unnecessarily already (I see, for example, that the reg_notbol,
reg_noteol and reg_newline flags all use up complete int registers, not
merely bits of a single one).
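
To illustrate what "bits of a single one" could look like, here is a minimal
sketch. The F_* names and pack_flags() are hypothetical and are not taken
from musl; whether this actually reduces spills depends on the compiler and
the target, so it would have to be measured.

```c
#include <regex.h>

/* Hypothetical bit names for illustration; not musl identifiers. */
enum {
	F_NOTBOL  = 1 << 0,
	F_NOTEOL  = 1 << 1,
	F_NEWLINE = 1 << 2
};

/* Fold the three booleans into one word so that the inner loop keeps
 * a single value live instead of three separate ints. */
unsigned pack_flags(int cflags, int eflags)
{
	unsigned f = 0;

	if (eflags & REG_NOTBOL)
		f |= F_NOTBOL;
	if (eflags & REG_NOTEOL)
		f |= F_NOTEOL;
	if (cflags & REG_NEWLINE)
		f |= F_NEWLINE;
	return f;
}
```

The matcher would then test, say, (flags & F_NOTBOL) wherever it currently
tests a separate variable.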

It seems, specifically, that the *match_end_ofs parameter of the two
regexec backends is always set to point to eo, which is so far not
initialized. You could initialize it to -1 and set it to pmatch[0].rm_eo
if the REG_STARTEND flag is set. The GET_NEXT_WCHAR() macro would then
need to test something like

	/* -1 means REG_STARTEND was not given, i.e. no end bound */
	if (*match_end_ofs >= 0 && str_byte >= string + *match_end_ofs) {
		ret = REG_NOMATCH; goto error_exit;
	}

This does not handle non-zero pmatch[0].rm_so, though. I would probably
try to pass another input parameter for that, but I have not verified yet
that a "^" would be handled properly (if pmatch[0].rm_so > 0 and
REG_STARTEND is set, "^" should *not* match).
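
For reference, this is roughly how a caller uses the flag where the libc
provides it: a minimal sketch, with match_range() being a hypothetical
helper and not code from git or musl. REG_STARTEND is an extension, not
POSIX, so it is guarded by #ifdef. The fallback branch is only an
approximation: it cannot handle embedded NUL bytes, and anchors may behave
differently on the copied range.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: match `re` against the bytes buf[start..end)
 * and report offsets relative to buf. Returns 0 on a match, otherwise
 * REG_NOMATCH or another regexec() error code. */
int match_range(const regex_t *re, const char *buf,
		size_t start, size_t end, regmatch_t *m)
{
#ifdef REG_STARTEND
	/* On input, pmatch[0] carries the range to be searched. */
	m->rm_so = (regoff_t)start;
	m->rm_eo = (regoff_t)end;
	return regexec(re, buf, 1, m, REG_STARTEND);
#else
	/* Approximation without the flag: copy the range into a
	 * NUL-terminated buffer and shift the offsets back. */
	size_t len = end - start;
	char *tmp = malloc(len + 1);
	int ret;

	if (!tmp)
		return REG_ESPACE;
	memcpy(tmp, buf + start, len);
	tmp[len] = '\0';
	ret = regexec(re, tmp, 1, m, 0);
	free(tmp);
	if (!ret) {
		m->rm_so += (regoff_t)start;
		m->rm_eo += (regoff_t)start;
	}
	return ret;
#endif
}
```

The point of the flag is that the first branch needs neither the copy nor a
NUL terminator at the end of the range.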

> I'll look at doing this when we overhaul/replace the regex
> implementation, and I'm happy to do some performance-regression tests
> for adding it now if someone has a simple patch (as was mentioned on the
> musl list).

I'd be interested to be kept in the loop, if you do not mind Cc:ing me.

Ciao,
Johannes

Footnote *1*:
http://stackstatus.net/post/147710624694/outage-postmortem-july-20-2016
