On Wed, 09 Feb 2005 10:56:51 -0800 (PST), Matthew Weier O'Phinney
<matthew@xxxxxxxxxx> wrote:
> * W Luke <wtluke@xxxxxxxxx>:
> > I've been fascinated by Flickr's, del.icio.us and other sites' usage
> > of these Weighted Lists. It's simple but effective and I really want
> > to use it for a project I'm doing.
> >
> > So I had a look at Nick Olejniczak's plugin for Wordpress (available
> > here: www.nicholasjon.com) but am struggling to understand the logic
> > behind it.
> >
> > What I need is to dump all words (taken from the DB) from just one
> > column into an array. Filter out common words
> > (the,a,it,at,you,me,he,she etc), then calculate most frequent words to
> > provide the weighted list. Has anyone attempted this?
>
> Funny you should mention this -- I'm working on something like this
> right now for work.
>
> Basically, you need to:
>
> * define a list of common words to skip
> * define weighting (I weight items in a title and in text differently,
>   for instance -- usually you weight by which field you're using); store
>   weighting in an associative array
> * define a weights array (associative array of word => score)
> * separate all text from the column into words (build a words array)
> * loop over the words array
>   * skip if the word is a common word
>   * increment word element in weights array by the weight
>
> The sticky issues are: what is a word (you'll need to build a regexp for
> that), and how will you weight words (usually by field). Once you have
> all this, you populate a database table for use as a reverse lookup.

Thanks Matthew - a similar logic to what I'm using, although mine (so
far) is a little clunky. I might ping you later on today if that's ok,
to talk a bit more and share notes.

Regards,

--
Will

The Corridor of Uncertainty
http://www.cricket.mailliw.com/

- Sanity is a madness put to good use -

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
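
For reference, the steps Matthew lists in the quoted message can be sketched roughly as below. This is only an illustration, written in Python for brevity, and not the code either poster is using. The stop-word set, the field names and weights, the word regexp, and the function names `weigh_words` and `top_words` are all assumptions made for the example. In PHP, associative arrays would play the role of the dicts here.

```python
import re
from collections import defaultdict

# Assumed stop-word list; the thread only gives a few examples.
COMMON_WORDS = {"the", "a", "it", "at", "you", "me", "he", "she"}

# Assumed per-field weights: a word in a title counts more than one in body text.
FIELD_WEIGHTS = {"title": 3, "text": 1}

# One possible answer to "what is a word": a run of letters,
# optionally followed by an apostrophe and more letters (e.g. "don't").
WORD_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")


def weigh_words(rows):
    """Return a {word: score} dict for an iterable of {field: text} rows."""
    weights = defaultdict(int)
    for row in rows:
        for field, text in row.items():
            weight = FIELD_WEIGHTS.get(field, 0)
            if weight == 0:
                continue  # field carries no weight, ignore it
            for word in WORD_RE.findall(text.lower()):
                if word in COMMON_WORDS:
                    continue  # skip common words
                weights[word] += weight  # increment by the field's weight
    return dict(weights)


def top_words(weights, limit=10):
    """Return the highest-scoring (word, score) pairs, ties broken alphabetically."""
    return sorted(weights.items(), key=lambda item: (-item[1], item[0]))[:limit]


# Made-up rows standing in for what would come out of the database.
rows = [
    {"title": "Weighted lists in PHP",
     "text": "A weighted list shows the most frequent words."},
    {"title": "Tag clouds",
     "text": "Flickr uses weighted lists for tags."},
]
scores = weigh_words(rows)
print(top_words(scores, 3))
```

The resulting word/score pairs are what would then be written to the lookup table Matthew mentions. Note that the sketch does no stemming, so "list" and "lists" are scored separately.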