On Thu, Mar 03 2022, Jaydeep Das wrote:

> How about modifying the number match regex to:
>
> `[0-9._]+([Ee][-+]?[0-9]+)?[fFlLuU]*[^a-zA-Z]` ?
>
> The `[^a-zA-Z]` in the end would make sure to not match
> the `.F` in `X.Find`.
>
> Additionally, we can add another regex for matching just
> the method calls:
>
> `[.][a-zA-Z()0-9]+`
>
> Both of these changes would make word_regex match 2 tokens in
> X.Find() : X and .Find() (Here X can be any valid identifier name)
>
>> How many tokens will the word-regex find in the expression X.e+200UL?
>> .e+200UL is a single token.
>> It's most easily fixed by requiring a digit before the fullstop. But if
>> floatingpoint numbers can begin with a fullstop, then we need a second
>> expression that requires a digit after a leading fullstop.
>
> But that syntax would be wrong. I tried making a condition like you said,
> but it always ended up breaking something else (like breaking 2.e+200UL
> into 2, .e, + and 200UL).
>
> Also, I realized I made a bit of a mistake in the identifier regex.
> Both _abc and __abc are valid identifiers. _3432, __3232 are valid
> identifiers too (not numbers).
>
> The previous regex matched only one `_`, so in the next patch,
> I plan to implement the following regex:
>
> Identifier: `([_]*[a-zA-Z]|[_]+[0-9]+)[a-zA-Z0-9_]*`
>
> Numbers: `[0-9_.]+([Ee][-+]?[0-9]+)?[fFlLuU]*[^a-zA-Z]`
> (It makes sure that in X.Find, .F is not matched)
>
> Additionally, an extra regex for method calls:
>
> `[.][a-zA-Z()0-9]+`
>
> What do you think?

Just a small note on rx syntax: [.] can be handy to escape "." (you can
also use "\\.", but that's arguably not as easy to read). But there's no
reason to use [_]* over just _*.

(Also, I have an in-flight change to userdiff.c that would conflict, but
I wonder if it wouldn't be handy to make the word_regex a "struct
userdiff_funcname". Then we could specify icase flags, which in this
case would make it a lot easier to read.)
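
To make the syntax note concrete, here is a rough sketch in Python rather than C, so it only approximates what regcomp() would do with these patterns. The names IDENT_PLAIN, IDENT_ICASE and tokens() are made up for illustration and are not anything in userdiff.c. The tokens() helper assumes the word regex is applied repeatedly from left to right, keeping the longest alternative at each offset and falling back to a single non-space character. It checks that the simpler spellings accept the same identifiers and that X.Find() splits into the two tokens described above:

```python
import re

# Patterns exactly as proposed in the quoted mail.
IDENT = r"([_]*[a-zA-Z]|[_]+[0-9]+)[a-zA-Z0-9_]*"
NUMBER = r"[0-9_.]+([Ee][-+]?[0-9]+)?[fFlLuU]*[^a-zA-Z]"
METHOD = r"[.][a-zA-Z()0-9]+"

# Hypothetical simplification: "_*" and "_+" instead of "[_]*" and "[_]+".
IDENT_PLAIN = r"(_*[a-zA-Z]|_+[0-9]+)[a-zA-Z0-9_]*"
# Hypothetical icase variant: only valid with a case-insensitive flag.
IDENT_ICASE = r"(_*[a-z]|_+[0-9]+)[a-z0-9_]*"


def tokens(text, patterns):
    """Approximate a leftmost-longest word split (illustrative helper).

    At each offset every pattern is tried and the longest match wins;
    if nothing matches, one non-space character becomes the token.
    """
    compiled = [re.compile(p) for p in patterns]
    out, pos = [], 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        best = text[pos]  # single-character fallback
        for rx in compiled:
            m = rx.match(text, pos)
            if m and len(m.group(0)) > len(best):
                best = m.group(0)
        out.append(best)
        pos += len(best)
    return out


# "[.]" and an escaped dot match the same thing: a literal ".".
assert re.fullmatch(r"[.]", ".") and re.fullmatch(r"\.", ".")
assert not re.fullmatch(r"[.]", "a") and not re.fullmatch(r"\.", "a")

# The three identifier spellings accept the same sample identifiers.
for s in ("_abc", "__abc", "_3432", "__3232", "X"):
    assert re.fullmatch(IDENT, s)
    assert re.fullmatch(IDENT_PLAIN, s)
    assert re.fullmatch(IDENT_ICASE, s, re.IGNORECASE)

print(tokens("X.Find()", [IDENT, NUMBER, METHOD]))  # ['X', '.Find()']
```

Python's re engine backtracks and is not POSIX leftmost-longest, which is why the helper picks the longest alternative by hand. Treat the output as a sanity check only, not as proof of what git's word diff will do.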