On Wed, Oct 05, 2016 at 01:17:49PM +0200, Johannes Schindelin wrote:
> Hi Rich,
>
> On Tue, 4 Oct 2016, Rich Felker wrote:
>
> > On Tue, Oct 04, 2016 at 06:08:33PM +0200, Johannes Schindelin wrote:
> >
> > > And lastly, the best alternative would be to teach musl about
> > > REG_STARTEND, as it is rather useful a feature.
> >
> > Maybe, but it seems fundamentally costly to support -- it's extra
> > state in the inner loops that imposes costly spill/reload on archs
> > with too few registers (x86).
>
> It is true that it could cause that.
>
> I had a brief look at the source code (you use backtracking...

Where did you get that idea? Backtracking is the most utterly
incompetent way to implement regex -- it throws away the whole
property that makes regex useful, being regular. Unfortunately, POSIX
BRE is not regular, as it contains backreferences, so any
implementation of regcomp/regexec requires at least a minimal
backtracking code path for BREs that contain backreferences.

> hopefully
> nobody uses musl to parse regular expressions from untrusted, or

On the contrary, musl's is the only system regcomp/regexec I'm aware
of that actually attempts to be safe with untrusted input -- when
using REG_EXTENDED (ERE). Other implementations provide backreferences
in ERE as an extension, making ERE unsafe just like BRE. musl
intentionally disallows them as a feature.

At least until recently, glibc also crashed on malloc failures in
regcomp, making it unsafe on untrusted input for that reason too.

Rich
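
For anyone who wants to check the BRE/ERE difference on their own libc,
here is a minimal sketch. The helper name try_match is made up for
illustration and is not part of any libc. It assumes only the POSIX
regcomp/regexec/regfree interface. The BRE case must compile everywhere,
since backreferences are part of POSIX BRE. How the ERE pattern (a)\1 is
handled is implementation-specific, so the sketch only reports the result
and makes no claim about any particular libc's return value.

```c
#include <regex.h>
#include <stdio.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns  1 if `pattern` compiles with `cflags` and matches `text`,
 *          0 if it compiles but does not match,
 *         -1 if regcomp() rejects the pattern.
 */
int try_match(const char *pattern, int cflags, const char *text)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, cflags) != 0)
        return -1;              /* pattern not accepted by this libc */

    /* nmatch == 0: we only care whether there is a match at all. */
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}

/* Print the result for one pattern/flags/text combination. */
void report(const char *label, const char *pattern, int cflags,
            const char *text)
{
    printf("%s: pattern \"%s\" on \"%s\" -> %d\n",
           label, pattern, text, try_match(pattern, cflags, text));
}
```

Calling report("BRE", "\\(a\\)\\1", 0, "aa") should print 1 on any
conforming implementation. Calling report("ERE", "(a)\\1", REG_EXTENDED,
"aa") shows what the local libc does with a backreference in ERE, which
is the case that matters for untrusted patterns.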