Re: RegOops

Correct. 

So in my case, where I need to grab just words,

[\D] will grab all words, dashes, hyphens, etc.
Same with [\S].

In essence, they grab all the words if there is nothing else to grab except a number.
However, the shorthand \w does not behave that way, and it seems (to me) that by definition it should capture only words and not the number 1.
Frank's explanation makes some sense to me, but how come it didn't grab the number 0 then?
If you notice, the 10 got split up.
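
For anyone reading this thread later, the split of the 10 can be reproduced with the short sketch below. The likely cause is greedy matching plus backtracking: \w+ first takes "vehicle10" (digits count as word characters), then gives back a single character so that [0-9+]+ has something to match before the hyphen. The helper name groups() and the third, letters-only pattern are assumptions added for illustration and are not from the original question.

```php
<?php
// Sample string from the original question.
$subject = 'vehicle10-vehicle-name';

// Hypothetical helper: returns the preg_match groups, or an empty array on no match.
function groups(string $pattern, string $subject): array
{
    return preg_match($pattern, $subject, $m) === 1 ? $m : [];
}

// \w+ is greedy and also matches digits, so it first swallows "vehicle10",
// then backtracks by one character so that [0-9+]+ can match the "0".
$greedyWord = groups('/(\w+)([0-9+]+)-(.*)/', $subject);

// \D+ cannot match digits at all, so it stops in front of "10".
$nonDigit = groups('/(\D+)([0-9+]+)-(.*)/', $subject);

// One possible stricter variant (an assumption about the intent):
// letters only, then digits only.
$lettersOnly = groups('/([A-Za-z]+)([0-9]+)-(.*)/', $subject);

print_r($greedyWord);
print_r($nonDigit);
print_r($lettersOnly);
```

With the sample string, the first pattern should give "vehicle1" and "0", while the other two give "vehicle" and "10". Note that \D+ would also accept hyphens and other punctuation, so the letters-only class is probably the safer choice if the prefix must be purely alphabetic.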

Best,

Karl DeSaulniers
Design Drumm
http://designdrumm.com



On Oct 23, 2015, at 7:54 PM, German Geek <geek.de@xxxxxxxxx> wrote:

> In regular expressions, a backslash followed by a capital letter means the opposite of the lowercase shorthand. So, \D is NON-digits, \W is NON-word characters and \S is NON-whitespace. You can also do [A-z]* to get all letters in the English alphabet, plus the characters that sit between the two ranges, like ^ and a literal \.
> 
> I believe you can also do Unicode ranges with the respective \usomehex, but I haven't tried that yet.
> 
> Tim
> 
> On Sat, 24 Oct 2015 at 12:38 Karl DeSaulniers <karl@xxxxxxxxxxxxxxx> wrote:
> On Oct 23, 2015, at 7:54 AM, Frank Arensmeier <farensmeier@xxxxxxxxx> wrote:
> 
> >
> >> 23 okt 2015 kl. 14:44 skrev Karl DeSaulniers <karl@xxxxxxxxxxxxxxx>:
> >>
> >> Hello all,
> >> With the given string..
> >>
> >> vehicle10-vehicle-name
> >>
> >> Running regex in a preg_match like
> >>
> >> "/(\w+)([0-9+]+)-(.*)/"
> >>
> >> I am getting.
> >>
> >> array(
> >>      0       =>      vehicle10-vehicle-name
> >>      1       =>      vehicle1
> >>      2       =>      0
> >>      3       =>      vehicle-name
> >> )
> >>
> >> If I change it to.
> >>
> >> "/(\D+)([0-9+]+)-(.*)/"
> >>
> >> it works as expected.
> >>
> >> array(
> >>      0       =>      vehicle10-vehicle-name
> >>      1       =>      vehicle
> >>      2       =>      10
> >>      3       =>      vehicle-name
> >> )
> >>
> >> Why is the \w directive including a digit?
> >> Since when is the number 1 a word??
> >>
> >> If anyone could enlighten me, I would greatly appreciate it.
> >>
> >> TIA
> >>
> >> Best,
> >>
> >> Karl DeSaulniers
> >> Design Drumm
> >> http://designdrumm.com
> >>
> >
> > Hi Karl!
> >
> > I am not able to pinpoint the exact definition in the official PCRE documentation right now (http://www.pcre.org), but the shorthand \w does indeed include digits, as you can read here, for example (https://docs.oracle.com/javase/tutorial/essential/regex/pre_char_classes.html):
> >
> > \w    A word character: [a-zA-Z_0-9]
> >
> > Although it's already Friday, your pattern is working as expected.
> >
> > /frank
> >
> 
> Oh, OK, so \w is basically shorthand for [a-zA-Z_0-9]?
> That would make sense; however, I think it is misleading, as there are \D and \S, which denote grabbing words and/or digits respectively.
> I thought that \w meant one 'word' character (not digit or special characters or space or new line, just a word),
> or at least that is what I have read in my searches, hence the question here.
> 
> Thanks for your response!
> 
> Best,
> 
> Karl DeSaulniers
> Design Drumm
> http://designdrumm.com
> --
> PHP General Mailing List (http://www.php.net/)
> To unsubscribe, visit: http://www.php.net/unsub.php
> 

