Regarding the number-removal part: the simplest solution is probably a single regex replace that swaps every digit ([0-9]) for an empty string. That would cut the entire search-and-replace down to one line of code.

Regards,
Adam.

"Bob Hall" <rjhjr@cox.net> wrote in message
news:20030813235558.GA50475@kongemord.krig.net...
> On Wed, Aug 13, 2003 at 02:48:11PM -0400, Herhuth, Ron wrote:
> > I have an array of a submitted text string (called $submittedTextString).
> > And I have an array of common words (called $commonWordArray).
> >
> > I'm trying to use Regex to:
>
> These aren't necessarily regex problems.
>
> > 1. Remove any words from $submittedTextString that appear in
> > $commonWordArray (about a hundred words).
>
> My suggestion is to use split() or explode() to split
> $submittedTextString and put the words in an array. Then use
> array_intersect() to return an array of the words that appear in
> both arrays. Then you could either loop through the intersection
> array and use str_replace() to replace each word with the empty
> string, or convert the intersection array to an array that maps
> each word to the empty string, and use that as the map argument
> for strtr().
>
> > 2. Remove any numbers from $submittedTextString.
>
> The brute force method is to loop through 0-9, using str_replace()
> to replace each digit with the empty string, whether the digit
> exists or not. Another approach is to search for numbers first,
> and use str_replace() only if you find something. str_replace()
> has to do a search anyway, so a separate search is probably
> redundant, meaning the brute force method is probably faster. If
> your numbers may include non-numeric characters (e.g. decimal
> points), use [0-9]*[.]?[0-9]+([.][0-9]+)?, or something similar,
> in ereg() to return the number, and pass it to str_replace() to
> replace with the empty string.
>
> > 3. Remove any characters from $submittedTextString that aren't
> > alphabetical.
>
> Try searching for [^a-zA-Z] and apply str_replace() to anything
> you find. For non-English alphabets you may need to alter that,
> e.g. [^a-åA-Å]. I've only tried PHP regex with English, so I
> don't know.
>
> > 4. Remove any duplicate words from $submittedTextString
>
> Use split() or explode(), sort the resulting array with asort() so
> you don't change the indices, and delete any element of the array that
> matches the element immediately before. You should be able to do this
> with either a loop or with array_walk(). Re-sort the remaining elements
> with ksort() to put them back in their original order, and use join()
> or implode() to convert the array back to a string.
>
> I haven't tried writing any code, so you'll have to figure out the
> details yourself.
>
> Bob Hall

--
PHP Windows Mailing List (http://www.php.net/)
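
For what it's worth, here is a minimal sketch of the four steps discussed in this thread. Nobody in the thread posted code, so treat it as one possible reading rather than anyone's actual solution. The function name clean_text() and the sample input are made up for illustration. It uses preg_replace() and preg_split() rather than ereg() and split(), since those were removed in PHP 7. It also swaps in array_diff() for the array_intersect()/str_replace() approach and array_unique() for the asort()/ksort() approach. Replacing non-letters with a space instead of the empty string is my own assumption, made so that neighbouring words don't run together.

```php
<?php
// Hypothetical helper: the name and signature are not from the thread.
function clean_text(string $text, array $commonWords): string
{
    // Step 2: the one-line regex replace, every digit becomes the empty string.
    $text = preg_replace('/[0-9]/', '', $text);

    // Step 3: collapse each run of non-alphabetical characters into one space
    // (assumption: a space rather than '' so adjacent words stay separate).
    $text = preg_replace('/[^a-zA-Z]+/', ' ', $text);

    // Split on whitespace, discarding empty pieces.
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

    // Step 1: drop every word that appears in the common-word list.
    // The comparison is case-sensitive.
    $words = array_diff($words, $commonWords);

    // Step 4: drop duplicates. array_unique() keeps the first occurrence
    // of each word and leaves the remaining words in their original order.
    $words = array_unique($words);

    return implode(' ', $words);
}

echo clean_text('the 3 cats and the 2 dogs, the cats!', ['the', 'and']), "\n";
// prints "cats dogs"
```

If the common-word match should ignore case, one option is to lowercase both the text and the word list with strtolower() before comparing.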