Re: Find/count different word in a text


On 25 Apr 2012, at 17:59, Karl-Arne Gjersøyen wrote:

> I am looking for a way to find and count the different words in a text.
> I am thinking of saving the text as an array and iterating through it,
> registering each different word in the text.
> For example, if "bread" is used more than once, I will count it only once.
> After the check, where I have found one instance of every word, I
> count them and report how many different words there are in the text.
> 
> Is this difficult to do? Can you give me a hint as to where I should look
> on www.php.net for a solution to this problem? What function do I
> need?

Let's start with... have you had a go? We're not here to do it for you.

Some things to consider...

How big is the text you're wanting to process?

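If it's relatively small you can load it all into memory, lowercase it, and use preg_split to break it on anything that isn't a letter or digit, which gives you an array of the individual words. Hand that to array_unique to get the unique words, or to array_count_values to get each word along with how many times it appears; either way, count() on the result tells you how many different words there are.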
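
A minimal sketch of the small-text approach, in case it helps to see the pieces together. The function name count_words is made up for illustration, and the choices to lowercase the text and to treat apostrophes as part of a word are assumptions, not requirements. strtolower only touches ASCII letters; if the mbstring extension is available, mb_strtolower would handle accented letters as well.

```php
<?php
// Hypothetical helper: returns an array of word => number of occurrences.
function count_words(string $text): array
{
    // Lowercase so "Bread" and "bread" are treated as the same word (an assumption).
    $text = strtolower($text);

    // Split on runs of anything that is not a letter, a digit or an apostrophe.
    // PREG_SPLIT_NO_EMPTY drops the empty strings left at the start or end.
    $words = preg_split("/[^\\p{L}\\p{N}']+/u", $text, -1, PREG_SPLIT_NO_EMPTY) ?: [];

    return array_count_values($words);
}

$counts = count_words("Bread and butter. Bread, again!");
echo count($counts), " different words\n";
```

If the per-word totals are not needed, array_unique on the split result followed by count() gives the same number of different words.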

If it's large enough that loading and processing the entire thing would take a sizeable chunk of memory, you'll be better off reading the file in chunks and processing each one as you go. For each chunk do the same as above, but store every word as a key in a single array that lives across all chunks; repeated words just overwrite the same key, so at the end you can simply count the entries in that array. Watch that a chunk boundary doesn't cut a word in half; reading line by line with fgets is an easy way to avoid that.
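
Here is one rough way the large-file version could look. It reads line by line with fgets rather than in fixed-size chunks, on the assumption that no word spans a line break. The function name count_distinct_words_in_file and the temporary demo file are made up for illustration.

```php
<?php
// Hypothetical helper: counts the different words in a file without loading it whole.
function count_distinct_words_in_file(string $path): int
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $seen = [];
    // fgets() returns one line per call, so only one line is in memory at a time.
    while (($line = fgets($handle)) !== false) {
        $words = preg_split("/[^\\p{L}\\p{N}']+/u", strtolower($line), -1, PREG_SPLIT_NO_EMPTY) ?: [];
        foreach ($words as $word) {
            // The word is the array key, so a repeated word reuses the same entry.
            $seen[$word] = true;
        }
    }
    fclose($handle);

    return count($seen);
}

// Demo with a throwaway file; a real script would pass its own path.
$path = tempnam(sys_get_temp_dir(), 'words');
file_put_contents($path, "Bread and butter.\nBread, again!\n");
$distinct = count_distinct_words_in_file($path);
unlink($path);
echo $distinct, " different words\n";
```

Memory use then grows with the number of different words rather than with the size of the file.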

Have a go and if you have problems post the code and we can help you some more.

-Stuart

-- 
Stuart Dallas
3ft9 Ltd
http://3ft9.com/

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



