Re: Fast prefix search?

js <ebgssth@xxxxxxxxx> · Sun, 14 Oct 2007 11:32:50 +0900

On 10/14/07, Nathan Nobbe <quickshiftin@xxxxxxxxx> wrote:
> how big is your dataset; have you tested against a potential data set and
> gotten long execution times?

The dataset is several million lines, and I've tested a script like
the one above against it. It took almost an hour.
(A Perl script that uses set_range finished the job within 2 minutes.)

> notice how i used strpos rather than strstr, because i got this tip from
> the docs
> Note: If you only want to determine if a particular needle occurs within
> haystack, use the faster and less memory intensive function strpos()
> instead.

On 10/14/07, Robert Cummings <robert@xxxxxxxxxxxxx> wrote:
> So don't use strstr() use strpos(). Specifically use it like follows:
>
>     if( strpos( $haystack, $prefix ) === 0 )
>     {
>         // it's a prefix.
>     }
>

Great tip. Thank you!
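
One thing I'd still like to check: as far as I understand, strpos() keeps
scanning the rest of the haystack when the prefix is not at the start, so a
comparison limited to the first strlen($prefix) bytes should do less work on
the lines that don't match. A minimal sketch using strncmp(); the helper name
has_prefix is just something I made up for illustration:

```php
<?php
// Hypothetical helper (the name is an assumption, not from this thread):
// returns true when $haystack starts with $prefix.
function has_prefix(string $haystack, string $prefix): bool
{
    // strncmp() compares at most strlen($prefix) bytes, so it stops at the
    // first mismatch instead of searching the rest of the line.
    return strncmp($haystack, $prefix, strlen($prefix)) === 0;
}
```

I haven't timed this against the dataset yet, so treat it as something to try.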

> there are some optimization points as well.  if you use a for loop, i think
> it will run a hair faster than the foreach, just make sure to store the
> length of the dictionary in a variable and use that as the sentinel control
> variable. also, if your search can be case insensitive you can use stripos
> instead of strpos, that will probably get you a little speed bump as well.

I'll try.

> if the string methods in a loop are too doggy due to the size of your
> datasets, you might write a program that uses berkdb and call it using the
> shell from within php.

That sounds like a shell script...

There seems to be no easy solution for this job.
The best bet might be to write a do-it-yourself B-tree
implementation in PHP...
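
Before going that far, a sorted array with binary search might get close to
what set_range does in the Perl version: jump to the first key that is not
less than the prefix, then read forward while the keys still match. This is
only a rough sketch. It assumes the goal is to find every line that starts
with a given prefix, that all lines fit in memory, and that they are sorted
bytewise. The names lower_bound and prefix_search are made up for this
example:

```php
<?php
// Index of the first element >= $key in a bytewise-sorted array,
// or count($sorted) when every element is smaller.
function lower_bound(array $sorted, string $key): int
{
    $lo = 0;
    $hi = count($sorted);
    while ($lo < $hi) {
        $mid = intdiv($lo + $hi, 2);
        if (strcmp($sorted[$mid], $key) < 0) {
            $lo = $mid + 1;   // the answer lies to the right of $mid
        } else {
            $hi = $mid;       // $mid itself may be the answer
        }
    }
    return $lo;
}

// All entries of $sorted that start with $prefix.
function prefix_search(array $sorted, string $prefix): array
{
    $matches = [];
    $len = strlen($prefix);
    $n = count($sorted);
    // Matching entries are contiguous in a sorted array, so stop at the
    // first entry that no longer starts with the prefix.
    for ($i = lower_bound($sorted, $prefix); $i < $n; $i++) {
        if (strncmp($sorted[$i], $prefix, $len) !== 0) {
            break;
        }
        $matches[] = $sorted[$i];
    }
    return $matches;
}

// Tiny stand-in for the real dataset.
$lines = ['banana', 'apple', 'apricot', 'applet', 'cherry'];
sort($lines, SORT_STRING);   // bytewise order, consistent with strcmp()
print_r(prefix_search($lines, 'app'));   // matches: apple, applet
```

Each lookup costs about log2(N) string comparisons plus one per match, so
for several million lines that is a couple of dozen comparisons instead of a
full scan. Whether the whole array fits in PHP's memory limit is a separate
question.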

Anyway, thank you. Your advice extended my knowledge a little :)

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php