On Tue, May 10, 2005 5:58 am, Merlin said:
I am writing an internal full-text search engine and am having trouble outputting the text in an appropriate way.
The problem is that if there is more than one search word, I can't handle the text cropping.
For example:
Search term: php germany
Text from database: There is no such great language than php. Amongh those countries using this language is Germany with a good percentage of users.
Text output should be: ...language than php. Amongh... language is Germany with a good...
Similar to the way Google does it. I have tried a couple of approaches (with strpos and substr), but most of them failed.
Is there an out-of-the-box solution in PHP, or does anybody know a good script that does this? It sounds like a standard feature to me.
Here's a quickie, untested, and probably with some kind of logic errors, or at least things not quite what you want.
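$fulltext = "There is no such great language than php. Amongh those countries using this language is Germany with a good percentage of users.";
$words = explode(" ", "php germany");
$snippets = '';
foreach ($words as $word) {
    if (stristr($snippets, $word) === false) { // skip this if we already got the word.
        // stripos() is case-insensitive, so "germany" also finds "Germany".
        $start = stripos($fulltext, $word);
        if ($start !== false) {
            $end = $start + strlen($word);
            // strpos() takes the haystack first; clamp the offsets to the string.
            $jumpback = strpos($fulltext, ' ', max(0, $start - 20));
            $jumpforward = strpos($fulltext, ' ', min(strlen($fulltext), $end + 20));
            if ($jumpback === false || $jumpback > $start) {
                $jumpback = max(0, $start - 20); // no space before the word
            }
            if ($jumpforward === false) {
                $jumpforward = strlen($fulltext); // no space left: run to the end
            }
            // substr() wants a length, not an end position.
            $snippet = substr($fulltext, $jumpback, $jumpforward - $jumpback);
            $snippets .= " $snippet ";
        }
    }
}
foreach ($words as $word) {
    // str_ireplace() matches case-insensitively; the output takes the casing of $word.
    $snippets = str_ireplace($word, "<b>$word</b>", $snippets);
}
echo $snippets;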
To Do:
Might wanna store an array of start/end numbers for the snippets, then sort by start, then combine those that "overlap" (one snippet's end runs past the next one's start), and only *THEN* build the output from the combined ranges, so you don't have snippets out of order, nor overlapping.
Still, I got ya started...
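
The to-do above (collect start/end offsets, sort by start, merge the ones that overlap) could be sketched roughly as follows. This is only an illustration, not code from the thread: the helper names snippet_ranges() and build_snippets() and the $pad parameter are made up, it assumes PHP 7.1 or later, and it pads by a fixed number of characters instead of snapping to the nearest space.

```php
<?php
// Hypothetical helper (not from the original script): returns merged
// [start, end) character ranges around the first match of each word.
function snippet_ranges(string $text, array $words, int $pad = 20): array
{
    $ranges = [];
    $len = strlen($text);
    foreach ($words as $word) {
        if ($word === '') {
            continue; // nothing to search for
        }
        $pos = stripos($text, $word); // case-insensitive search
        if ($pos === false) {
            continue; // word not in the text
        }
        // Pad on both sides, clamped to the bounds of the string.
        $ranges[] = [max(0, $pos - $pad), min($len, $pos + strlen($word) + $pad)];
    }

    // Sort by start offset so any overlap is always with the previous range.
    usort($ranges, function ($a, $b) {
        return $a[0] <=> $b[0];
    });

    $merged = [];
    foreach ($ranges as $range) {
        $last = count($merged) - 1;
        if ($last >= 0 && $range[0] <= $merged[$last][1]) {
            // Overlaps or touches the previous range: extend that one.
            $merged[$last][1] = max($merged[$last][1], $range[1]);
        } else {
            $merged[] = $range;
        }
    }
    return $merged;
}

// Hypothetical helper: cuts the merged ranges out of the text, in order.
function build_snippets(string $text, array $words, int $pad = 20): string
{
    $parts = [];
    foreach (snippet_ranges($text, $words, $pad) as [$start, $end]) {
        $parts[] = substr($text, $start, $end - $start);
    }
    return '...' . implode('...', $parts) . '...';
}
```

Because the ranges are merged before any text is cut, two search words that sit close together end up in one snippet, and the snippets always come out in document order regardless of the order of the search words.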
Hi Richard,
Thank you for the jump start! I have fixed some errors in the script and now it works (I am attaching the script). You are right, the overlap is still a to-do.
Best regards, Merlin
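<?php
$fulltext = "There is no such great language than php. Amongh those countries using this language is Germany with a good percentage of users.";
$words = explode(" ", "php Germany");
$snippets = '';
foreach ($words as $word) {
    if (stristr($snippets, $word) === false) { // skip this if we already got the word.
        $start = strpos($fulltext, $word); // case-sensitive search
        if ($start !== false) {
            $end = $start + strlen($word);
            // Clamp the offsets so they stay inside the string.
            $jumpback = strpos($fulltext, ' ', max(0, $start - 20));
            $jumpforward = strpos($fulltext, ' ', min(strlen($fulltext), $end + 20));
            if ($jumpback === false || $jumpback > $start) {
                $jumpback = max(0, $start - 20); // no space before the word
            }
            if ($jumpforward === false) {
                $jumpforward = strlen($fulltext); // no space left: run to the end
            }
            // substr() takes a length, not an end offset.
            $snippet = '...' . substr($fulltext, $jumpback, $jumpforward - $jumpback) . '...';
            $snippets .= " $snippet ";
        }
    }
}
// Wrap each search word in <b> tags.
foreach ($words as $word) {
    $snippets = str_replace($word, "<b>$word</b>", $snippets);
}
echo $snippets;
?>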
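
One thing the script leaves open is case: the original search term was "php germany", and the attached version searches for "php Germany" with the case-sensitive strpos() and str_replace(). A possible way to highlight matches regardless of case while keeping the casing found in the text is sketched below. The function name highlight_words() is made up for illustration, and the sketch assumes the snippets are plain text.

```php
<?php
// Hypothetical helper: wraps every case-insensitive match of any search
// word in <b> tags, keeping the casing that appears in the text.
function highlight_words(string $snippets, array $words): string
{
    $quoted = [];
    foreach ($words as $word) {
        if ($word !== '') {
            // Escape regex metacharacters (and the "/" delimiter) in the word.
            $quoted[] = preg_quote($word, '/');
        }
    }
    if ($quoted === []) {
        return $snippets; // nothing to highlight
    }
    // A single pass over an alternation of all words; $0 is the text as matched.
    return preg_replace('/' . implode('|', $quoted) . '/i', '<b>$0</b>', $snippets);
}
```

For example, highlight_words('...language is Germany with a good...', ['php', 'germany']) would return '...language is <b>Germany</b> with a good...'. Doing all words in one pass also means the inserted <b> tags are never scanned again, so a search word such as "b" cannot match inside a tag added for an earlier word.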
-- PHP General Mailing List (http://www.php.net/) To unsubscribe, visit: http://www.php.net/unsub.php