Re: Re: keywords generation

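Use a simple:
arsort($total, SORT_NUMERIC);
$total will then hold the array sorted by value, with the most common word at the top and the least common at the bottom, and each word still attached to its count. (asort() would give you the reverse order, lowest first.) Just a quick question: why're you using Latin? :P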
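
A minimal sketch of that sort plus the comma-separated keyword string asked for below. The sample counts are made up for illustration (and chosen without ties, since the order of equal values is not guaranteed on older PHP versions); only arsort(), array_slice(), array_keys() and implode() are used:

```php
<?php
// Hypothetical sample counts; in the thread, $total is built inside generate_keywords().
$total = array('dolor' => 5, 'sit' => 1, 'elit' => 2, 'dolore' => 3, 'te' => 4);

// arsort() sorts by value, highest first, and keeps each word attached to its count.
arsort($total, SORT_NUMERIC);

// Take the first three entries, then join their keys (the words) with commas.
$keywords = implode(',', array_keys(array_slice($total, 0, 3)));

echo $keywords, "\n"; // dolor,te,dolore
```

Change the 3 to however many keywords you want in the string.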


> Hi,
>
> no problem at all...
>
> well, the script is incomplete because I don't know how to sort the
> $total array by value... for example:
> Array
> (
>      [Lorem] => 1
>      [ipsum] => 1
>      [dolor] => 5
>      [sit] => 1
>      [met] => 1
>      [consectetuer] => 1
>      [dipiscing] => 1
>      [elit] => 2
>      [sed] => 1
>      [dim] => 0
>      [nonummy] => 1
>      [nibh] => 1
>      [euismod] => 1
>      [tincidunt] => 1
>      [ut] => 5
>      [loreet] => 0
>      [dolore] => 3
>      [mgn] => 0
>      [liqum] => 0
>      [ert] => 0
>      [volutpt] => 0
>      [Ut] => 1
>      [wisi] => 1
>      [enim] => 1
>      [d] => 19
>      [minim] => 1
>      [venim] => 0
>      [quis] => 1
>      [nostrud] => 1
>      [exerci] => 1
>      [ttion] => 0
>      [ullmcorper] => 0
>      [suscipit] => 1
>      [lobortis] => 1
>      [nisl] => 1
>      [liquip] => 1
>      [ex] => 2
>      [e] => 49
>      [commodo] => 1
>      [consequt] => 0
>      [Duis] => 1
>      [utem] => 1
>      [vel] => 3
>      [eum] => 1
>      [iriure] => 1
>      [in] => 5
>      [hendrerit] => 1
>      [vulputte] => 0
>      [velit] => 1
>      [esse] => 1
>      [molestie] => 1
>      [illum] => 1
>      [eu] => 5
>      [feugit] => 0
>      [null] => 2
>      [fcilisis] => 0
>      [t] => 39
>      [vero] => 1
>      [eros] => 1
>      [et] => 5
>      [ccumsn] => 0
>      [ius] => 1
>      [odio] => 1
>      [dignissim] => 1
>      [qui] => 3
>      [blndit] => 0
>      [present] => 0
>      [lupttum] => 0
>      [zzril] => 1
>      [delenit] => 1
>      [ugue] => 1
>      [duis] => 1
>      [te] => 4
>      [fcilisi] => 0
> )
>
> That is, each word and its total number of occurrences in the text... Now I want to
> sort it from the highest values to the lowest... and then return a
> keyword string such as:
>
> $keywords = 'dolor,t,eu,te,in';
>
> Got it?
>
> Thanks,
> Bruno B B Magalhaes
>
> On Nov 5, 2004, at 7:42 PM, M. Sokolewicz wrote:
>
>> Bruno b b magalhães wrote:
>>
>>> Hi People,
>>> well, I am building a very sophisticated(?) CMS, and I am thinking of
>>> implementing an automatic keyword generation function... I thought of
>>> the following structure:
>>> ==================
>>> $submited_text = 'Lorem ipsum dolor sit amet, consectetuer adipiscing
>>> elit, sed diam nonummy nibh euismod tincidunt ut laoreet dolore magna
>>> aliquam erat volutpat. Ut wisi enim ad minim veniam, quis nostrud
>>> exerci tation ullamcorper suscipit lobortis nisl ut aliquip ex ea
>>> commodo consequat. Duis autem vel eum iriure dolor in hendrerit in
>>> vulputate velit esse molestie consequat, vel illum dolore eu feugiat
>>> nulla facilisis at vero eros et accumsan et iusto odio dignissim qui
>>> blandit praesent luptatum zzril delenit augue duis dolore te feugait
>>> nulla facilisi. ';
>>> generate_keywords($submited_text);
>>> function generate_keywords($text)
>>> {
>>>     if(isset($text) && $text != '')
>>>     {
>>>         $words_to_ignore = array('/a/',
>>>                                 '/to/',
>>>                                 '/of/',
>>>                                 '/from/'
>>>                                 );
>>>         $words =
>>> str_word_count(preg_replace($words_to_ignore,'',$text),1);
>>>         foreach($words as $var=>$val)
>>>         {
>>>             $total[$val] = substr_count($text,$val);
>>>         }
>>>     }
>>> }
>>> =================
>>> How can I sort the resulting array by value without losing its
>>> key associations?
>>> Is there a faster way of doing this?
>>> Regards,
>>> Bruno B B Magalhaes
>> Yes!!! DO NOT USE REGEXPS WHEN YOU DON'T NEED THEM! :) (not to be
>> rude, but you *really* don't need them. You're doing SIMPLE
>> string replacements, which run at LEAST a factor of 20 faster using
>> str_replace than using *any* regexp function.)
>>
>> Just remember the following:
>> from fastest to slowest:
>> str_replace
>> str_ireplace
>> preg_replace
>> ereg_replace
>> eregi_replace
>>
>> If you're not doing any regexp magic, use str_replace (or str_ireplace
>> as of PHP 5). To quote the str_replace section of the PHP
>> manual:
>> [snip]If you don't need fancy replacing rules (like regular
>> expressions), you should always use this function instead of
>> ereg_replace() or preg_replace().[/snip].
>>
>> Also, what does:
>> [snip]$words =
>> str_word_count(preg_replace($words_to_ignore,'',$text),1);[/snip] do?
>> The result is not used after storing it, nor is it returned in any
>> way...?
>>
>> Now, at the end you have the $total array, and you discard it at the
>> end... how... useful? (void return)
>>
>> hope those remarks help you, and if you consider me rude, just blame
>> it on a very early shift I had today
>>
>> --
>> PHP General Mailing List (http://www.php.net/)
>> To unsubscribe, visit: http://www.php.net/unsub.php
>>
>>

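Putting the remarks in this thread together, here is one way the whole function could look without any regexps. This is a sketch under assumptions, not the original poster's final code: the $limit parameter, the lowercasing step and the use of array_count_values() are additions made here for illustration. It also avoids the odd counts in the quoted dump: the '/a/' pattern strips every letter "a" from the text (so "diam" becomes "dim"), and substr_count() counts substrings rather than whole words (hence [e] => 49).

```php
<?php
// Sketch only: the stop-word list and the $limit parameter are illustrative assumptions.
function generate_keywords($text, $limit = 5)
{
    $words_to_ignore = array('a', 'to', 'of', 'from');

    // Lowercase first so "Ut" and "ut" are counted as the same word.
    $words = str_word_count(strtolower($text), 1);

    // Drop stop words by comparing whole words, not substrings.
    $words = array_diff($words, $words_to_ignore);

    // Count every remaining word in a single pass.
    $total = array_count_values($words);

    // Highest count first, keeping the word => count association.
    arsort($total, SORT_NUMERIC);

    // Return the top $limit words as a comma-separated string.
    return implode(',', array_keys(array_slice($total, 0, $limit)));
}

echo generate_keywords('from the cat to the dog and the bird and a fish', 2), "\n"; // the,and
```

Unlike the quoted version, this one returns its result, so the caller can store it, e.g. $keywords = generate_keywords($submited_text);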