Re: [PHP-WIN] RegEx help needed

Commenting on the replacing-numbers part: the simplest solution is probably
a single regex replace that matches [0-9] and substitutes the empty string.
That cuts the entire search-and-replace to one line of code.
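
As a minimal sketch of that one-liner, assuming preg_replace() is available and using a made-up sample string (only the variable name $submittedTextString comes from the original question):

```php
<?php
// Hypothetical sample input; only the variable name comes from the thread.
$submittedTextString = 'abc123def 45';

// One call removes every ASCII digit; all other characters are left alone.
$withoutDigits = preg_replace('/[0-9]/', '', $submittedTextString);

echo $withoutDigits, "\n";
```

Note that whitespace around a removed number stays in place, so a later trim or whitespace-collapsing step may be wanted.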

Regards, Adam.

"Bob Hall" <rjhjr@cox.net> wrote in message
news:20030813235558.GA50475@kongemord.krig.net...
> On Wed, Aug 13, 2003 at 02:48:11PM -0400, Herhuth, Ron wrote:
> > I have an array of a submitted text string (called
> > $submittedTextString).
> > And I have an array of common words (called $commonWordArray).
> >
> > I'm trying to use Regex to:
>
> These aren't necessarily regex problems.
>
> > 1. Remove any words from $submittedTextString that appear in
> > $commonWordArray (about a hundred words).
>
> My suggestion is to use split() or explode() to split
> $submittedTextString and put the words in an array. Then use
> array_intersect() to return an array of the words in both arrays.
> Then you could either loop through the intersection array and
> use str_replace() to replace each word with the empty string, or
> convert the intersection array to an array that maps each
> word to the empty string, and use that as the map
> argument for strtr().
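
A possible sketch of the word-removal step. It is a variation on the suggestion above rather than a literal transcription: str_replace() and strtr() work on substrings, so removing "the" would also damage "other". Here array_diff() is swapped in to drop whole words instead. The sample data is invented, and matching is case-sensitive.

```php
<?php
// Hypothetical sample data; the variable names follow the original question.
$submittedTextString = 'the quick brown fox and the lazy dog';
$commonWordArray = array('the', 'and', 'a', 'of');

// Split on runs of whitespace, discarding empty pieces.
$words = preg_split('/\s+/', $submittedTextString, -1, PREG_SPLIT_NO_EMPTY);

// array_diff() keeps only the words that do NOT appear in the common-word list.
$keptWords = array_diff($words, $commonWordArray);

// Join the survivors back into a single string.
$filteredText = implode(' ', $keptWords);

echo $filteredText, "\n";
```

If the common-word list is lowercase, running strtolower() on the input first would make the comparison effectively case-insensitive.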
>
> > 2. Remove any numbers from $submittedTextString.
>
> The brute force method is to loop through 0-9, using str_replace()
> to replace each digit with the empty string, whether the digit
> exists or not. Another approach is to search for numbers first,
> and use str_replace() only if you find something. str_replace()
> has to do a search anyway, so a separate search is probably
> redundant, meaning the brute force method is probably faster. If
> your numbers may include non-numeric characters (e.g. decimal
> points), use [0-9]*[.]?[0-9]+([.][0-9]+)?, or something similar,
> in ereg() to return the number, and pass it to str_replace() to
> replace with the empty string.
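
Both ideas can be sketched roughly as follows. The sample string is invented. The second part uses preg_replace() with the pattern quoted above instead of ereg() followed by str_replace(); that is a substitution on my part, since ereg() was removed in PHP 7.

```php
<?php
// Hypothetical sample input.
$submittedTextString = 'pi is 3.14 and e is 2.718';

// Brute force: str_replace() accepts an array of search strings, so all
// ten digits can be removed in one call without an explicit loop.
$digits = array('0', '1', '2', '3', '4', '5', '6', '7', '8', '9');
$noDigits = str_replace($digits, '', $submittedTextString);

// Whole numbers including a decimal point, using the pattern from the text
// wrapped in PCRE delimiters.
$numberPattern = '/[0-9]*[.]?[0-9]+([.][0-9]+)?/';
$noNumbers = preg_replace($numberPattern, '', $submittedTextString);

echo $noDigits, "\n";
echo $noNumbers, "\n";
```

The brute-force version leaves the decimal points behind, which is the case the longer pattern is meant to handle.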
>
> > 3. Remove any characters from $submittedTextString that aren't
> > alphabetical.
>
> Try searching for [^a-zA-Z] and apply str_replace() to anything
> you find. For non-English alphabets you may need to alter that,
> e.g. [^a-åA-Å]. I've only tried PHP regex with English, so I
> don't know whether that works.
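
A rough sketch of the character filter, using preg_replace() to search and replace in one step. Keeping spaces in the allowed set is my assumption, so that words stay separated. The second call assumes UTF-8 input and uses the Unicode letter class \p{L} with the /u modifier as one possible way to cover non-English alphabets; the sample strings are invented.

```php
<?php
// Hypothetical sample input.
$submittedTextString = 'Hello, World! 42 times.';

// Remove everything that is not an ASCII letter or a space.
$lettersOnly = preg_replace('/[^a-zA-Z ]/', '', $submittedTextString);

// Assumption: UTF-8 text. \p{L} matches any Unicode letter when /u is set.
$unicodeLettersOnly = preg_replace('/[^\p{L} ]/u', '', 'Blåbær & grød!');

echo $lettersOnly, "\n";
echo $unicodeLettersOnly, "\n";
```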
>
> > 4. Remove any duplicate words from $submittedTextString
>
> Use split() or explode(), sort the resulting array with asort() so
> you don't change the indices, and delete any element of the array that
> matches the element immediately before. You should be able to do this
> with either a loop or with array_walk(). Re-sort the remaining elements
> with ksort() to put them back in their original order, and use join()
> or implode() to convert the array back to a string.
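
The steps above might look something like this, with invented sample data. Which copy of a duplicate survives depends on how asort() orders equal elements, so the final word order of the sort-based version can vary between PHP versions. The built-in array_unique() is shown alongside as a shorter alternative that keeps the first occurrence of each word.

```php
<?php
// Hypothetical sample input.
$submittedTextString = 'red green red blue green red';

$words = explode(' ', $submittedTextString);

// Sort by value while keeping the original indices.
asort($words);

// Drop every element that equals the one immediately before it.
$previous = null;
foreach ($words as $index => $word) {
    if ($word === $previous) {
        unset($words[$index]);
    } else {
        $previous = $word;
    }
}

// Restore the original order of the surviving elements and rejoin them.
ksort($words);
$viaSort = implode(' ', $words);

// Alternative: array_unique() keeps the first occurrence and preserves keys.
$viaUnique = implode(' ', array_unique(explode(' ', $submittedTextString)));

echo $viaSort, "\n";
echo $viaUnique, "\n";
```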
>
> I haven't tried writing any code, so you'll have to figure out the
> details yourself.
>
> Bob Hall





-- 
PHP Windows Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

