detecting spam keywords with stripos

Hi there,

I am matching text against an array of keywords to detect spam. Unfortunately, there are some false positives because stripos also finds a keyword when it occurs inside a longer word.
For example, the keyword "Werbung" (advertising) is found inside "Bewerbung" (job application).

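First thought: use strpos instead, but that only makes the match case-sensitive. It still matches inside words, so it does not help in all cases.
Second thought: split the text into words and use in_array, but this does not find keywords that span more than one word, such as "zu Hause" or "flexible/Arbeit".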
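
To make the two problems concrete, here is a small sketch. The sample text and the keywords are made up for illustration, and the split on non-letter characters is only an assumption about how "split text into words" would be done:

```php
<?php
// Illustration only: sample text and keywords are invented for this sketch.
$text = 'Meine Bewerbung: Arbeit von zu Hause, flexible/Arbeit';

// stripos matches the keyword anywhere, including inside a longer word.
$insideWord = stripos($text, 'Werbung') !== false;  // true, found inside "Bewerbung"

// Splitting into words (assumed here: split on anything that is not a
// letter or digit) and using in_array avoids that false positive...
// strtolower is enough here because the sample text is plain ASCII.
$words = preg_split('/[^\p{L}\p{N}]+/u', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
$wholeWord = in_array('werbung', $words, true);     // false, "bewerbung" != "werbung"

// ...but a keyword containing a space or a slash can never equal one token.
$phrase = in_array('zu hause', $words, true);        // false, although the text contains it
$slash  = in_array('flexible/arbeit', $words, true); // false, although the text contains it
```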

Does somebody have an idea on how to make my function better in terms of not detecting the string inside a word? Here is the code:

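// Load the keywords and their weights from the result set.
$keyword = array();
$weight  = array();
while ($row = db_get_row($result)) {
	$keyword[] = $row->keyword;
	$weight[]  = $row->weight;
}
$num_results = db_numrows($result);

// Assumes these two are not already set earlier in the script.
$spam_level = 0;
$triggered_keywords = '';

for ($i = 0; $i < $num_results; $i++) {
	$findme = $keyword[$i];
	// stripos is case-insensitive but also matches inside longer words
	$pos  = stripos($data['txt'], $findme);
	$pos2 = stripos($data['title'], $findme);
	if ($pos !== false || $pos2 !== false) { // spam!
		$spam_level += $weight[$i];
		$triggered_keywords .= $keyword[$i] . ', ';
	}
}
$spam['score'] += $spam_level;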
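
One direction that might work, sketched here under assumptions and not tested against the real keyword table: replace the stripos calls with a regular expression that only accepts the keyword when it is not directly preceded or followed by a letter or digit. The function name keyword_matches is hypothetical, and the /u modifier assumes that both the text and the keywords are valid UTF-8.

```php
<?php
// Hypothetical helper, not part of the original code.
// Returns true if $keyword occurs in $text as a whole word or phrase,
// i.e. not directly preceded or followed by a letter or digit.
function keyword_matches($text, $keyword)
{
	// preg_quote escapes regex metacharacters and the "/" delimiter,
	// so keywords such as "flexible/Arbeit" are matched literally.
	$pattern = '/(?<![\p{L}\p{N}])' . preg_quote($keyword, '/') . '(?![\p{L}\p{N}])/iu';
	// preg_match returns 1 on a match, 0 on no match, false on error
	// (for example when the input is not valid UTF-8).
	return preg_match($pattern, $text) === 1;
}

$a = keyword_matches('Meine Bewerbung liegt bei', 'Werbung');  // false
$b = keyword_matches('WERBUNG jetzt ansehen', 'Werbung');      // true
$c = keyword_matches('Arbeit von zu Hause aus', 'zu Hause');   // true
$d = keyword_matches('Suche flexible/Arbeit!', 'flexible/Arbeit'); // true
```

In the loop above this would mean calling keyword_matches($data['txt'], $findme) and keyword_matches($data['title'], $findme) in place of the two stripos calls. Whether a regex per keyword is fast enough for a large keyword list is something I have not measured.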

Thank you for any help!

Merlin

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php

