Re: [PHP-WIN] RegEx help needed

On Wed, Aug 13, 2003 at 02:48:11PM -0400, Herhuth, Ron wrote:
> I have an array of a submitted text string (called $submittedTextString).
> And I have an array of common words (called $commonWordArray).
> 
> I'm trying to use Regex to:

These aren't necessarily regex problems.
 
> 1. Remove any words from $submittedTextString that appear in
> $commonWordArray (about a hundred words).

My suggestion is to use split() or explode() to split 
$submittedTextString and put the words in an array. Then use 
array_intersect() to return an array of the words that appear in 
both arrays. Then you could either loop through the intersection 
array and use str_replace() to replace each word with the empty 
string, or convert the intersection array to an array that maps 
each word to the empty string, and use that as the map argument 
for strtr().
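
A rough, untested-in-production sketch of the array approach, 
using array_intersect() (the PHP name for the intersection 
function). It departs from the suggestion in one respect: 
str_replace() and strtr() work on substrings, so they would also 
hit "the" inside "other". The sketch filters the word array with 
array_diff() instead. The sample data is made up, and the 
comparison is case-sensitive.

```php
<?php
// Hypothetical sample data; the variable names follow the question.
$submittedTextString = "the quick brown fox jumps over the lazy dog";
$commonWordArray = array("the", "over", "a", "an");

// Split on single spaces to get one array element per word.
$words = explode(" ", $submittedTextString);

// Words that occur in both arrays (original keys are preserved).
$found = array_intersect($words, $commonWordArray);

// Everything that is NOT a common word, still in the original order.
$kept = array_diff($words, $commonWordArray);

$filtered = implode(" ", $kept);
echo $filtered, "\n"; // quick brown fox jumps lazy dog
```

If the common words can differ in case from the submitted text, 
lower-casing both sides with strtolower() first is one option.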
 
> 2. Remove any numbers from $submittedTextString.

The brute force method is to loop through 0-9, using str_replace()
to replace each digit with the empty string, whether the digit 
exists or not. Another approach is to search for numbers first, 
and use str_replace() only if you find something. str_replace() 
has to do a search anyway, so a separate search is probably 
redundant, meaning the brute force method is probably faster. If 
your numbers may include non-numeric characters (e.g. decimal 
points), use [0-9]*[.]?[0-9]+([.][0-9]+)?, or something similar, 
in ereg() to return the number, and pass it to str_replace() to 
replace with the empty string.
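
A small sketch of both ideas, with made-up input. The first part 
is the brute force digit removal. In the second part 
preg_replace() stands in for the ereg()/str_replace() pair, since 
the ereg family was removed in PHP 7; the pattern is the one 
given above.

```php
<?php
// Hypothetical sample input.
$submittedTextString = "order 66 shipped 3 boxes at 12.5 kg";

// Brute force: replace each digit 0-9 with the empty string.
// str_replace() accepts an array of search strings.
$digits = str_split("0123456789");
$noDigits = str_replace($digits, "", $submittedTextString);
echo $noDigits, "\n";  // order  shipped  boxes at . kg

// Regex variant: remove whole numbers, including a decimal part,
// so the "." of "12.5" does not get left behind.
$pattern = '/[0-9]*[.]?[0-9]+([.][0-9]+)?/';
$noNumbers = preg_replace($pattern, "", $submittedTextString);
echo $noNumbers, "\n"; // order  shipped  boxes at  kg
```

Both variants leave doubled spaces where a number used to be; 
those can be collapsed later when the string is split into words.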
 
> 3. Remove any characters from $submittedTextString that aren't
> alphabetical.

Try searching for [^a-zA-Z] and apply str_replace() to anything 
you find. For non-English alphabets you may need to alter that, 
e.g. [^a-åA-Å]. I've only tried PHP regex with English, so I 
don't know.
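
A minimal sketch for English text only, with made-up input. It 
uses preg_replace() to do the search and the replacement in one 
call, which is a substitution for the search-then-str_replace() 
idea above. Note that [^a-zA-Z] also matches spaces, so the 
second variant adds a space to the class to keep words apart.

```php
<?php
// Hypothetical sample input.
$submittedTextString = "Hello, World! 42 times";

// Strip everything that is not an ASCII letter (spaces included).
$lettersOnly = preg_replace('/[^a-zA-Z]/', "", $submittedTextString);
echo $lettersOnly, "\n";      // HelloWorldtimes

// Same idea, but keep spaces so the words stay separated.
$lettersAndSpaces = preg_replace('/[^a-zA-Z ]/', "", $submittedTextString);
echo $lettersAndSpaces, "\n"; // Hello World  times
```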
 
> 4. Remove any duplicate words from $submittedTextString

Use split() or explode(), sort the resulting array with asort() so 
you don't change the indices, and delete any element of the array that 
matches the element immediately before. You should be able to do this 
with either a loop or with array_walk(). Re-sort the remaining elements 
with ksort() to put them back in their original order, and use join() 
or implode() to convert the array back to a string.
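
The steps above could look roughly like this, using a plain loop 
and made-up input. It is a sketch, not tuned code. PHP's built-in 
array_unique() should give the same result in one call, keeping 
the first occurrence of each word.

```php
<?php
// Hypothetical sample input.
$submittedTextString = "red red blue green green";

$words = explode(" ", $submittedTextString);

// Sort by value but keep the original indices attached.
asort($words, SORT_STRING);

// Duplicates are now adjacent: drop any element equal to the one before.
// foreach walks a copy, so unsetting from $words inside the loop is safe.
$previous = null;
foreach ($words as $index => $word) {
    if ($word === $previous) {
        unset($words[$index]);
    } else {
        $previous = $word;
    }
}

// Sort by index to restore the original word order.
ksort($words);

$deduplicated = implode(" ", $words);
echo $deduplicated, "\n"; // red blue green
```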

I haven't tried writing any code, so you'll have to figure out the 
details yourself.

Bob Hall

-- 
PHP Windows Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

