Hi!

On Wed, Apr 19, 2023 at 10:23:29PM +0200, Alejandro Colomar wrote:
> On 4/19/23 19:48, наб wrote:
> > diff --git a/man3/regex.3 b/man3/regex.3
> > index d54d6024c..2c8b87aca 100644
> > --- a/man3/regex.3
> > +++ b/man3/regex.3
> > @@ -141,23 +141,20 @@ compilation flag
> >  above).
> >  .TP
> >  .B REG_STARTEND
> > -Use
> > -.I pmatch[0]
> > -on the input string, starting at byte
> > -.I pmatch[0].rm_so
> > -and ending before byte
> > -.IR pmatch[0].rm_eo .
> > +Match
> > +.RI [ string " + " pmatch->rm_so ", " string " + " pmatch->rm_eo )
> > +instead of
> > +.RI [ string ", " string " + \fBstrlen\fP(" string )).
> Hmmm, I like this!
>
> Let's see if I understand it.  pmatch[] is normally
> [[gnu::access(write_only, 4, 3)]]
> but if ((.eflags & REG_STARTEND) != 0) it's [1] and
> [[gnu::access(read_write, 4)]]?

I fucked the ternary in my previous mail I think, soz;
I don't know if it's gnu::anything, but you could model it as
  {
      if(eflags & REG_STARTEND)
          read(pmatch, 1);
      if(!(preg->flags & REG_NOSUB))  // as "set" in regcomp()
          write(pmatch, nmatch);
  }

I.e. pmatch[nmatch] must be a writable array, unless REG_NOSUB,
and also, additively, *pmatch must be readable if REG_STARTEND.

> >  This allows matching embedded NUL bytes
> >  and avoids a
> >  .BR strlen (3)
> > -on large strings.
> > -It does not use
> > +on known-length strings.
> >  .I nmatch
> > -on input, and does not change
> > -.B REG_NOTBOL
> > -or
> > -.B REG_NEWLINE
> > -processing.
> > +is not consulted for this purpose.
> > +If any matches are returned, they're relative to
> > +.IR string ,
> > +not
> > +.IR string " + " pmatch->rm_so .
> How are such matches returned?  In pmatch[>0]?  Or how?

In the usual way in pmatch[0..nmatch].
I guess the "nmatch isn't taken into account" thing is confusing,
because REG_STARTEND just adds a read.
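
To make the "read, then the usual write" point concrete, here is a minimal
sketch, assuming a libc that implements the REG_STARTEND extension (glibc and
the BSDs do). match_range() is a hypothetical helper made up for
illustration, not part of any libc; it uses one regmatch_t both as the input
range and as the slot for the whole match:

```c
#include <assert.h>
#include <regex.h>

/*
 * Hypothetical helper (not a libc function): match preg against
 * [string + so, string + eo) and leave the whole-match offsets in *out.
 * Returns whatever regexec() returns.
 * Assumes preg was compiled without REG_NOSUB.
 */
static int
match_range(const regex_t *preg, const char *string,
            regoff_t so, regoff_t eo, regmatch_t *out)
{
	assert(so <= eo);

	/* Input: REG_STARTEND makes regexec() read these two fields. */
	out->rm_so = so;
	out->rm_eo = eo;

	/*
	 * Output: with nmatch = 1 the same element is then overwritten
	 * with the whole match, as offsets from string (not string + so).
	 */
	return regexec(preg, string, 1, out, REG_STARTEND);
}
```

With the 9-byte buffer "abb\0abbbc", the pattern "b+" compiled with
REG_EXTENDED, so = 4 and eo = 9, this should leave rm_so = 5 and rm_eo = 8:
the search starts past the embedded NUL, and the offsets still count from
the start of the buffer.
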

regexec() can be modelled as
  {
      const char * start, * end;
      if(eflags & REG_STARTEND) {
          start = string + pmatch->rm_so;
          end   = string + pmatch->rm_eo;
      } else {
          start = string;
          end   = string + strlen(string);
      }

      // match stuff in [start, end)
  }

And that's the /only/ effect REG_STARTEND has
(+ matches are returned relative to string, not to start,
 but that's consistent, and they just got decoupled;
 it bears noting it there since it's not what I expected to happen).

I'll sleep on this and post something I hate less tomorrow.

Best,