Help with regex: breaking strings down to 'words' and 'phrases'


 



Hi All,

 

I'd very much appreciate some help building a regular expression for
preg_match_all that can differentiate between 'words' and 'phrases'.

 

For example, say I have a string that contains: 'this is an "example of a
phrase"'

 

I'd like to be able to break that down to:

 

this

is

an

example of a phrase

 

My current preg_match_all regex:

 

preg_match_all('([\w\-]+|[\(]|[\)])', "this is an \"example of a phrase\"", $arr);

 

returns the following:

 

Array
(
    [0] => Array
        (
            [0] => this
            [1] => is
            [2] => an
            [3] => example
            [4] => of
            [5] => a
            [6] => phrase
        )

)

 

Note: I'm using this to break the elements of a string down to build an SQL
string, which is why I'm looking for "(" and ")" characters (i.e. the
"[\(]|[\)]" part of the regex) and keeping them in the array. A
real-world example of the value being supplied to the regex might be
'completed and "January 2005" and not (store or online)'. I already have
the logic to handle "and", "or", "not" and "()", but haven't been able to
figure out how to keep a quoted substring as a single value in the
array.
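
One possible direction, offered as an untested-in-production sketch rather than a confirmed answer: put a quoted-phrase alternative in front of the existing word and parenthesis alternatives, so that the regex engine tries a whole "..." span before it falls back to single words. The function name tokenize_search() is hypothetical, and the sketch assumes phrases are delimited by plain double quotes with no escaped quotes inside them.

```php
<?php
// Hypothetical helper: splits a search string into words, parentheses,
// and double-quoted phrases (returned without their quotes).
function tokenize_search($input)
{
    // Alternatives are tried left to right at each position:
    //   "([^"]*)"  a double-quoted phrase, captured without the quotes
    //   [\w\-]+    a word (letters, digits, underscore, hyphen)
    //   [()]       a single opening or closing parenthesis
    preg_match_all('/"([^"]*)"|[\w\-]+|[()]/', $input, $matches, PREG_SET_ORDER);

    $tokens = array();
    foreach ($matches as $m) {
        // $m[1] exists only when the quoted-phrase alternative matched;
        // otherwise fall back to the full match in $m[0].
        $tokens[] = isset($m[1]) ? $m[1] : $m[0];
    }
    return $tokens;
}

print_r(tokenize_search('this is an "example of a phrase"'));
```

The order of the alternatives matters here: if the word alternative came first, the quote character would simply be skipped and the phrase would again be split into separate words.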

 

Any help appreciated!

 

 

Much warmth,

 

Murray

 

