* W Luke <wtluke@xxxxxxxxx>:
> I've been fascinated by Flickr's, del.icio.us and other sites' usage
> of these Weighted Lists. It's simple but effective and I really want
> to use it for a project I'm doing.
>
> So I had a look at Nick Olejniczak's plugin for Wordpress (available
> here: www.nicholasjon.com) but am struggling to understand the logic
> behind it.
>
> What I need is to dump all words (taken from the DB) from just one
> column into an array. Filter out common words
> (the,a,it,at,you,me,he,she etc), then calculate most frequent words to
> provide the weighted list. Has anyone attempted this?

Funny you should mention this -- I'm working on something like this
right now for work.

Basically, you need to:

* define a list of common words to skip
* define weighting (I weight items in a title and in text differently,
  for instance -- usually you weight by which field you're using);
  store weighting in an associative array
* define a weights array (associative array of word => score)
* separate all text from the column into words (build a words array)
* loop over the words array
  * skip if the word is a common word
  * increment word element in weights array by the weight

The sticky issues are: what is a word (you'll need to build a regexp for
that), and how will you weight words (usually by field).

Once you have all this, you populate a database table for use as a
reverse lookup.

For a good example of how to do this (in perl), see:

    http://www.perl.com/lpt/a/2003/09/25/searching.html

-- 
Matthew Weier O'Phinney           | WEBSITES:
Webmaster and IT Specialist       | http://www.garden.org
National Gardening Association    | http://www.kidsgardening.com
802-863-5251 x156                 | http://nationalgardenmonth.org
mailto:matthew@xxxxxxxxxx         | http://vermontbotanical.org

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
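
The steps in the reply above (skip list, per-field weights, a word => score array, then a loop over the words) could be sketched roughly as follows. This is only an illustration in Python, not the code from the post or from the linked article. The stop-word list, the field names `title` and `body`, the weights 3 and 1, the word regexp, and the helper names `score_words` and `top_words` are all assumptions made for the example; the same logic should port to PHP associative arrays without much change.

```python
import re
from collections import defaultdict

# Assumed stop-word list; a real one would be much longer.
COMMON_WORDS = {"the", "a", "it", "at", "you", "me", "he", "she",
                "and", "for", "of", "to", "in", "is"}

# Assumed per-field weights: a word in a title counts more than one in the body.
FIELD_WEIGHTS = {"title": 3, "body": 1}

# One possible answer to "what is a word": runs of letters,
# optionally joined by inner apostrophes (e.g. "don't").
WORD_RE = re.compile(r"[a-z]+(?:'[a-z]+)*")


def score_words(rows):
    """Return a dict of word => score for a list of row dicts."""
    weights = defaultdict(int)
    for row in rows:
        for field, weight in FIELD_WEIGHTS.items():
            text = row.get(field, "").lower()
            for word in WORD_RE.findall(text):
                if word in COMMON_WORDS:
                    continue  # skip common words
                weights[word] += weight  # increment by the field's weight
    return dict(weights)


def top_words(weights, limit=10):
    """Highest scores first; ties broken alphabetically."""
    return sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]


# Made-up sample rows standing in for rows fetched from the database.
rows = [
    {"title": "Growing tomatoes", "body": "The tomatoes need sun and water."},
    {"title": "Water for the garden", "body": "Water the garden at dawn."},
]
weights = score_words(rows)
print(top_words(weights, 3))
```

For the single-column case in the original question, `FIELD_WEIGHTS` would shrink to one entry with weight 1, which reduces the score to a plain word count. The resulting word => score pairs are what would be written to the reverse-lookup table.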