Hi Rich,

On Tue, 4 Oct 2016, Rich Felker wrote:

> On Tue, Oct 04, 2016 at 06:08:33PM +0200, Johannes Schindelin wrote:
>
> > And lastly, the best alternative would be to teach musl about
> > REG_STARTEND, as it is rather useful a feature.
>
> Maybe, but it seems fundamentally costly to support -- it's extra
> state in the inner loops that imposes costly spill/reload on archs
> with too few registers (x86).

It is true that it could cause that. I had a brief look at the source code
(you use backtracking... hopefully nobody uses musl to parse regular
expressions from untrusted, or inexperienced, sources [*1*]), and it seems
that the regex code might spill unnecessarily already (I see, for example,
that the reg_notbol, reg_noteol and reg_newline flags all use up complete
int registers, not merely bits of a single one).

It seems, specifically, that the *match_end_ofs parameter of the two
regexec backends is always set to point to eo, which is so far not
initialized. You could initialize it to -1 and set it to pmatch[0].rm_eo
if the REG_STARTEND flag is set. The GET_NEXT_WCHAR() macro would then
need to test something like

	if (str_byte >= string + *match_end_ofs) {
		ret = REG_NOMATCH;
		goto error_exit;
	}

This does not handle non-zero pmatch[0].rm_so, though. I would probably
try to pass another input parameter for that, but I have not verified yet
that a "^" would be handled properly (if pmatch[0].rm_so > 0 and
REG_STARTEND is set, "^" should *not* match).

> I'll look at doing this when we overhaul/replace the regex
> implementation, and I'm happy to do some performance-regression tests
> for adding it now if someone has a simple patch (as was mentioned on the
> musl list).

I'd be interested to be kept in the loop, if you do not mind Cc:ing me.

Ciao,
Johannes

Footnote *1*:
http://stackstatus.net/post/147710624694/outage-postmortem-july-20-2016
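
To make the end-offset check suggested above a bit more concrete, here is a
minimal, self-contained sketch. It is not musl's code: struct bounded_input,
next_byte() and count_bytes() are hypothetical names made up for
illustration. It also assumes that the limit test is skipped while the
offset still holds the -1 sentinel, since otherwise string + *match_end_ofs
would point before the buffer and every read would fail.

```c
#include <stddef.h>

/* Hypothetical stand-in for the matcher's input cursor (not musl's types). */
struct bounded_input {
	const char *string;   /* start of the buffer handed to regexec() */
	const char *str_byte; /* current read position */
	long match_end_ofs;   /* -1: no limit; otherwise pmatch[0].rm_eo */
};

/* Store the next byte in *out and return 1, or return 0 at end of input. */
static int next_byte(struct bounded_input *in, unsigned char *out)
{
	if (in->match_end_ofs >= 0) {
		/* REG_STARTEND case: stop at the offset, NUL bytes are data. */
		if (in->str_byte >= in->string + in->match_end_ofs)
			return 0;
	} else if (*in->str_byte == '\0') {
		/* Default case: ordinary NUL-terminated input. */
		return 0;
	}
	*out = (unsigned char)*in->str_byte++;
	return 1;
}

/* Count how many bytes the reader yields for a given end offset. */
static size_t count_bytes(const char *s, long end_ofs)
{
	struct bounded_input in = { s, s, end_ofs };
	unsigned char c;
	size_t n = 0;

	while (next_byte(&in, &c))
		n++;
	return n;
}
```

With an end offset of -1 the reader behaves like today's NUL-terminated
scan; with a non-negative offset it stops there even if the buffer
continues, and it reads past embedded NUL bytes. The start offset and the
"^" question are deliberately left out, as in the text above.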