On Fri, Feb 02, 2018 at 11:04:01AM -0500, R. G. Newbury wrote:
A bug in regex handling?
I am cleaning up some HTML code
.....
# grep -h '[0-9]*s[0-9]*">' temp
>> Returns the example line with the 's[0-9]">' highlighted.
Can anyone explain what is happening? This isn't politics, so the group
[0-9] should not equal [0-9"#]. Or even [0-9\"\#].
Fri, 2 Feb 2018 10:14:37 -0600 From: Chris Adams <linux@xxxxxxxxxxx>
A * in a regex is "0 or more of the previous", so basically you are just
matching 's[0-9]*">' (because there will always be at least 0 of the
[0-9] part at the start).
If you really mean "1 or more", you can use an extended regex (the -E
argument to grep/sed) and use + instead of *, so '[0-9]+s[0-9]*">'.
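
A minimal sketch of the difference between the two patterns. The original example line is not shown in the thread, so the sample line below is purely hypothetical, and the -o option (print only the matched text) assumes GNU grep or a compatible implementation:

```shell
# Hypothetical sample line; the line from the original post is not shown in the thread.
line='<a href="s5">link</a>'

# '*' allows zero digits before the s, so the match simply starts at the s.
star_match=$(printf '%s\n' "$line" | grep -o '[0-9]*s[0-9]*">')

# With -E, '+' requires at least one digit before the s, so nothing matches here.
# '|| true' keeps the script going when grep exits non-zero on no match.
plus_match=$(printf '%s\n' "$line" | grep -oE '[0-9]+s[0-9]*">' || true)

printf 'star: [%s]\n' "$star_match"
printf 'plus: [%s]\n' "$plus_match"
```

On this made-up line the first pattern matches only the text from the s onward, and the second matches nothing because no digit precedes the s.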
Fri, 02 Feb 2018 16:15:37 +0000 From: Patrick O'Callaghan
In grep, * matches any number of instances, including 0. You want to
use + (with grep -E, since + is not special in a basic regex) rather
than * to guarantee at least one digit.
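
One detail worth checking by hand: in a basic regex (plain grep without -E), + is an ordinary character. A small sketch, again assuming GNU grep for -o and using an invented sample string:

```shell
# Hypothetical sample text, invented for illustration.
line='id="12s5">'

# Without -E, '+' is a literal plus sign, so this looks for digit, "+", s and finds nothing.
basic_plus=$(printf '%s\n' "$line" | grep -o '[0-9]+s' || true)

# With -E, '+' means "one or more of the previous item".
extended_plus=$(printf '%s\n' "$line" | grep -oE '[0-9]+s')

printf 'basic:    [%s]\n' "$basic_plus"
printf 'extended: [%s]\n' "$extended_plus"
```

Only the -E form treats + as a quantifier here; the basic form comes back empty because the sample contains no literal plus sign.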
Fri, 2 Feb 2018 11:26:02 -0500 From: Jon LaBadie <jonfu@xxxxxxxxxx>
You are misunderstanding the "*". It means any sequence of the
associated character, including a ZERO-length sequence.
So [0-9]*s matches "s (actually just the s), as there is a zero-length
sequence of digits followed by an s. When you grep for [0-9]s, there
must be at least one digit before the s (but any extra digits are not
part of the match). Sometimes the sequence [0-9][0-9]*s is useful to
say "one or more digits before the s".
jl
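
The three cases described above can be tried out with a rough sketch like the following. The input strings are hypothetical, and -o (print only the matched text) assumes GNU grep or similar:

```shell
# Hypothetical input lines, invented for illustration.
with_digits='id="12s5">'
without_digits='id="s5">'

# [0-9]s matches exactly one digit plus the s; the leading 1 is not part of the match.
one_digit=$(printf '%s\n' "$with_digits" | grep -o '[0-9]s')

# [0-9][0-9]*s is the basic-regex way to say "one or more digits, then s".
all_digits=$(printf '%s\n' "$with_digits" | grep -o '[0-9][0-9]*s')

# The same pattern finds nothing when no digit precedes the s.
no_match=$(printf '%s\n' "$without_digits" | grep -o '[0-9][0-9]*s' || true)

printf 'one digit:  [%s]\n' "$one_digit"
printf 'all digits: [%s]\n' "$all_digits"
printf 'no digits:  [%s]\n' "$no_match"
```

The [0-9][0-9]*s form needs no -E flag, which makes it handy in tools or scripts limited to basic regexes.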
Thanks to all for the quick responses. I *tried* to RTFM but that was
not clear, even on a re-read. I took [0-9]* as multiple instances of
[0-9] but NOT zero instances.
Geoff