On Fri, Feb 02, 2018 at 11:04:01AM -0500, R. G. Newbury wrote:
> A bug in regx handling???
>
> I am cleaning up some html code, using sed to standardize the formatting. I
> was searching for specific instances of code to amend using grep.
> I was looking for instances like <a name="s1s1">
> Example text in a file: ( here named, quite originally, temp )
> <p class="section-f"></a><a name="s8"></a>8.</b></a>
>
> And # grep -h '[0-9]s[0-9]*">' temp
> Returns nothing (which is the expected result: there are no [0-9]s[0-9]">
> instances).
>
> BUT!!!
> # grep -h '[0-9]*s[0-9]*">' temp
> Returns the example line with the 's[0-9]">' highlighted.
>
> Note that the character before the 's' is either " or #
>
> Can anyone explain what is happening?. This isn't politics so the group
> [0-9] should not equal [0-9"#]. Or even [0-9\"\#].

You are misunderstanding the "*". It means any sequence of the
preceding item (here, a digit from [0-9]), including a ZERO-length
sequence. So [0-9]*s matches "s (actually just the s), as it is a
zero-length sequence of digits followed by an s.

When you grep for [0-9]s, there must be at least one digit before
the s (but any extra digits are not part of the match).

Sometimes the sequence [0-9][0-9]*s is useful to say "one or more
digits before the s".

jl
-- 
Jon H. LaBadie jonfu@xxxxxxxxxx
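
For anyone who wants to check the three patterns at the prompt, here is a minimal sketch, assuming a POSIX shell and a standard grep. The sample line is the one from the original post; the variable names are made up for illustration, and the line is piped in rather than read from the file "temp".

```shell
#!/bin/sh
# Sample line from the original post (held in a variable instead of the file "temp").
line='<p class="section-f"></a><a name="s8"></a>8.</b></a>'

# [0-9]s : a digit is required immediately before the s.
# grep -c prints the number of matching lines; "|| true" only hides
# grep's non-zero exit status when nothing matches.
one_digit=$(printf '%s\n' "$line" | grep -c '[0-9]s[0-9]*">' || true)

# [0-9]*s : zero or more digits before the s, so the empty run of digits
# in front of the s in "s8" is enough for a match.
zero_or_more=$(printf '%s\n' "$line" | grep -c '[0-9]*s[0-9]*">' || true)

# [0-9][0-9]*s : one or more digits before the s.
one_or_more=$(printf '%s\n' "$line" | grep -c '[0-9][0-9]*s[0-9]*">' || true)

# grep -o prints only the matched text, which shows that the character
# before the s (the double quote) is not part of the match.
matched=$(printf '%s\n' "$line" | grep -o '[0-9]*s[0-9]*">' || true)

echo "one_digit=$one_digit zero_or_more=$zero_or_more one_or_more=$one_or_more matched=$matched"
```

The -o output is the quickest way to see that the " before the s was never matched by [0-9]*; the pattern simply did not need anything there.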