On Fri, Feb 07, 2020 at 05:51:05PM +0100, Andreas Schwab wrote:
> On Feb 07 2020, Eric Sunshine wrote:
>
> > On Fri, Feb 7, 2020 at 10:09 AM SZEDER Gábor <szeder.dev@xxxxxxxxx> wrote:
> >> macOS 'sed', that's what I was missing :)
> >>
> >>   sed -n 's/^\(.*\) \+annotate:bugreport\[include\].* ::$/ "\1",/p' | sort
> >>
> >> and the 'sed' included in macOS apparently interprets that '\+'
> >> differently than GNU 'sed', and as a result won't match anything.
> >
> > More generally, this would be a problem with any 'sed' of BSD lineage.
> >
> >> FWIW, that '\+' doesn't seem to be necessary, though, and after
> >> removing it the resulting generated array looked good to me [...]
> >
> > A reasonable replacement for "<SP>\+" would be "<SP><SP>*" (where <SP>
> > represents 'space').
>
> Another problem with that regexp is that it contains two adjacent
> repetitions matching the same character. When there are two or more
> spaces before "annotate:" all but the last of them can be matched by
> either '\(.*\)' or ' \+'. To fix that '\(.*\)' needs to be modified to
> not match a trailing space.

Hum. I had assumed since the capture group was not greedy, it would not
capture any trailing spaces that could be captured by ' \+'. I don't see
a problem in making it non-explicit, though. I'll hack on this some
more.

I do find myself pretty annoyed at the matcher difference between 'sed'
and 'also sed', though, and don't see in the manpage a way to guarantee
which matcher 'sed' should use (a la 'grep -[EFGP]'). :)

 - Emily
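
For reference, the two suggestions quoted above (spell "one or more
spaces" as "<SP><SP>*", and keep the capture group from matching a
trailing space) could be combined roughly as follows. This is only a
sketch: the function name extract_keys and the sample input lines are
made up for illustration, and the '[^ ]' at the end of the group is one
possible way to exclude trailing spaces, not necessarily the form a
final patch would use.

```shell
#!/bin/sh
# Sketch only: extract_keys and the sample lines below are hypothetical.
extract_keys() {
    # "  *" (space, space, star) is the portable BRE spelling of
    # "one or more spaces", replacing the GNU-only " \+".
    # "[^ ]" as the last atom of the group forbids a trailing space in
    # the capture, so the run of spaces is matched only by "  *".
    sed -n 's/^\(.*[^ ]\)  *annotate:bugreport\[include\].* ::$/ "\1",/p' | sort
}

# Hypothetical input: two annotated entries and one unannotated entry.
printf '%s\n' \
    'core.editor   annotate:bugreport[include] ::' \
    'user.name annotate:bugreport[include] ::' \
    'core.pager ::' |
    extract_keys
```

With this sample input, only the two annotated entries are printed, each
quoted and followed by a comma, with no trailing spaces inside the
quotes.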