Re: looking for a PHP text indexer

Mihamina Rakotomandimby <mihamina@xxxxxxxxx> wrote on 11 June 2012 at 11:12:

> Hi all,
>
> I have a small job ad website, where some posters tend to flood it with the
> same ad, just to stay on top of the most-recent sort.
>
> To defeat the strict duplicate detection (yes, it's weak), they add
> one or two words that make a difference.
>
> The result is many duplicated ads.
>
> I would like to search for duplicates by looking for ads that share 80%-90%
> of their words and treat them as the same, so that I can group them.
>
> Of course, a rate-limiting mechanism or even moderation is
> planned, but I want to process the existing ads first.
>
> I don't want to use MySQL for indexing; I believe text indexers are the best
> tools for this. Am I wrong?
>
> What would you suggest for processing the ads and finding duplicates in that
> situation?

Maybe take a look at PHP's similar_text() and levenshtein() functions:

http://de.php.net/manual/de/function.similar-text.php
http://de.php.net/manual/de/function.levenshtein.php
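
As a rough sketch of how these two functions could be applied to the ads
(assuming PHP 7 or later for the type declarations): the helper names
normalize_ad(), ad_similarity() and is_probable_duplicate(), the sample ads
and the 80% threshold are all made up for illustration, so tune them against
real data. Note that similar_text() compares characters, not whole words, so
the percentage only approximates the "80%-90% same words" idea.

```php
<?php
// Hypothetical helper: normalise an ad so that case and spacing
// differences do not affect the comparison.
function normalize_ad(string $text): string
{
    // Lower-case, collapse runs of whitespace to one space, then trim.
    return trim(preg_replace('/\s+/', ' ', strtolower($text)));
}

// Similarity of two ads as a percentage (0..100), as reported by
// similar_text() through its third, by-reference argument.
function ad_similarity(string $a, string $b): float
{
    similar_text(normalize_ad($a), normalize_ad($b), $percent);
    return $percent;
}

// Assumed rule: treat two ads as duplicates at or above $threshold percent.
function is_probable_duplicate(string $a, string $b, float $threshold = 80.0): bool
{
    return ad_similarity($a, $b) >= $threshold;
}

// Invented sample ads.
$ad1 = 'Senior PHP developer wanted in Antananarivo, full time';
$ad2 = 'Senior PHP developer wanted in Antananarivo, full time urgent';
$ad3 = 'Part-time accountant needed for a small company';

var_dump(is_probable_duplicate($ad1, $ad2)); // bool(true): only one word added
var_dump(is_probable_duplicate($ad1, $ad3)); // bool(false): unrelated ad

// levenshtein() counts single-character insertions, deletions and
// substitutions, so it suits short strings such as ad titles.
echo levenshtein('developer', 'developper'), "\n"; // 1
```

Two caveats: before PHP 8.0, levenshtein() only accepts strings of up to 255
characters, which full ad texts will usually exceed, and similar_text() gets
slow on long strings. Comparing every ad against every other ad is also
quadratic in the number of ads, so for a large backlog it may be worth
comparing only ads from the same poster or category.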


>
> --
> RMA.
>
> --
> PHP General Mailing List (http://www.php.net/)
> To unsubscribe, visit: http://www.php.net/unsub.php
>
Marco Behnke
Dipl. Informatiker (FH), SAE Audio Engineer Diploma
Zend Certified Engineer PHP 5.3

Tel.: 0174 / 9722336
e-Mail: marco@xxxxxxxxxx

Softwaretechnik Behnke
Heinrich-Heine-Str. 7D
21218 Seevetal

http://www.behnke.biz
