On 10/13/07, js <ebgssth@xxxxxxxxx> wrote:
>
> On 10/14/07, Nathan Nobbe <quickshiftin@xxxxxxxxx> wrote:
> > can you use the php string manipulation functions ?
>
> I'll probably use strstr() to check whether a string starts with some
> prefix.
> But problem I like to solve is how to effectively pick strings
> starting with a prefix
> from a large dataset, like a dictionary.
>
> If Berkeley DB's set_range were available from PHP,
> I could write something like
>
> $word = dba_set_range($prefix, $dictionary) // get the first word starting with $prefix
> do {
>     if (!prefix($word) == $prefix) break
>     $found[] = $word
> } while ($word = dba_nextkey($dictionary))

how big is your dataset? have you tested against a realistic data set and
actually gotten long execution times?

    foreach ($dictionary as $curValue) {
        if (strpos($curValue, $prefix) === 0) {
            $found[] = $curValue;
        }
    }

note the === 0: strpos returns 0 when the match is at the very start of the
string and false when there is no match at all, and a loose comparison
treats those two as equal, so a prefix test has to compare strictly against 0.

notice how i used strpos rather than strstr, because i got this tip from the
docs:

  *Note:* If you only want to determine if a particular *needle* occurs
  within *haystack*, use the faster and less memory intensive function
  strpos() <http://www.php.net/manual/en/function.strpos.php> instead.

there are some possible optimization points as well. a for loop might run a
hair faster than the foreach, though i haven't measured it; if you try it,
make sure to store the length of the dictionary in a variable and use that as
the loop bound instead of calling count() on every iteration. also, if your
search needs to be case insensitive you can use stripos instead of strpos,
but don't expect a speed bump from that, since it has to do extra work to
ignore case.

if the string functions in a loop are too slow for the size of your datasets,
you might write a separate program that uses berkdb and call it through the
shell from within php.

-nathan
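
p.s. for what it's worth, the set_range idea from the quoted message (jump to the first key at or after the prefix, then walk forward until the prefix stops matching) can be approximated in plain php if the dictionary fits in memory as a sorted array. this is only a rough sketch under that assumption: lowerBound() and wordsWithPrefix() are made-up helper names, not php built-ins, the sample words are invented, and the type declarations and intdiv() assume PHP 7 or later.

```php
<?php
// Hypothetical helper (not a PHP built-in): index of the first element in a
// sorted list that compares >= $prefix, found by binary search.
function lowerBound(array $sorted, string $prefix): int
{
    $lo = 0;
    $hi = count($sorted);
    while ($lo < $hi) {
        $mid = intdiv($lo + $hi, 2);
        if (strcmp($sorted[$mid], $prefix) < 0) {
            $lo = $mid + 1;   // everything up to $mid sorts before the prefix
        } else {
            $hi = $mid;       // $mid is a candidate, keep looking to the left
        }
    }
    return $lo;
}

// Hypothetical helper: collect every word that starts with $prefix.
// $sorted must be a list (keys 0..n-1) sorted with sort($words, SORT_STRING).
function wordsWithPrefix(array $sorted, string $prefix): array
{
    $found = [];
    $len = strlen($prefix);
    $n = count($sorted);
    for ($i = lowerBound($sorted, $prefix); $i < $n; $i++) {
        if (strncmp($sorted[$i], $prefix, $len) !== 0) {
            break; // sorted order: no later word can match either
        }
        $found[] = $sorted[$i];
    }
    return $found;
}

// Invented sample data, standing in for the real dictionary.
$dictionary = ['banana', 'apple', 'applet', 'apply', 'ape', 'band'];
sort($dictionary, SORT_STRING);
print_r(wordsWithPrefix($dictionary, 'appl'));
```

with the sample data this collects apple, applet and apply. the sort is paid once up front; after that each lookup is a binary search plus a walk over just the matching words, instead of a scan of the whole dictionary. whether that beats the simple loop depends on the dataset size, so it is worth timing both.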