On Feb 4, 2008 3:40 PM, Shawn McKenzie <nospam@xxxxxxxxxxxxx> wrote:
> strip_tags() perhaps?

Perhaps; I've never been thrilled with strip_tags(), but it should work
well enough here. But combined with grep? I guess for most searches grep
would narrow things down reasonably well before you have to start
processing files in PHP. It would definitely only be useful for a small
site (as you suggested).

Identifying keywords wouldn't be all that difficult using the OP's
method either. The script could easily count the number of occurrences
of each word and create an index with the word, the URL, and the number
of occurrences (even excluding a list of noise words if desired) without
someone having to manually define a list of keywords. It could be run as
often as needed to keep the index up-to-date.

However, the thing I like most about using FULLTEXT or something like
htdig is that they already provide a good combination of indexing and
advanced search operators.

Andrew

>
> Andrew Ballard wrote:
> > On Feb 4, 2008 3:13 PM, Shawn McKenzie <nospam@xxxxxxxxxxxxx> wrote:
> >> If there aren't many files and you don't intend to grow this site much
> >> larger and intend to always have static HTML, an easy implementation
> >> would be to read each file and search for the terms either in the
> >> keywords tag or in the entire file.
> >>
> >> Optionally, if you're on a *nix host you could exec() a grep for the
> >> terms which returns the matching lines in an array and display as needed.
> >>
> >> -Shawn
> >>
> >
> > I'm dreading any searches that contain terms like "table", "body",
> > "style", "background", etc. These could be perfectly legitimate search
> > terms, but without the right filter they would match every document in
> > the site rather than just those that contain these terms in the actual
> > content rather than the markup.
> >
> > Andrew

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
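
P.S. For what it's worth, the indexing idea in the reply (strip the markup, count how often each word occurs per page, skip noise words) might look roughly like the sketch below. It is only an illustration under some assumptions: count_words() and build_index() are made-up names, the pages are passed in as a URL => HTML array rather than read from disk, and the text is assumed to be plain ASCII English, since str_word_count() ignores digits and is locale-dependent for other letters.

```php
<?php
// Hypothetical helper: count the content words in one HTML page,
// ignoring the markup itself.
function count_words(string $html, array $noiseWords = []): array
{
    // Drop <script> and <style> blocks first. strip_tags() removes the
    // tags but would leave their contents behind as if they were text.
    $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1>#is', ' ', $html);

    // Put a space before every tag so that "<p>foo</p><p>bar</p>" does
    // not collapse into the single word "foobar" once tags are removed.
    $html = str_replace('<', ' <', $html);

    // Remove the tags and decode entities such as &amp;.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');

    // Lowercase, split into words, and count each word.
    $counts = array_count_values(str_word_count(strtolower($text), 1));

    // Exclude the noise words, if any were given.
    return array_diff_key($counts, array_flip($noiseWords));
}

// Hypothetical helper: build an index of word => [url => occurrences].
function build_index(array $pages, array $noiseWords = []): array
{
    $index = [];
    foreach ($pages as $url => $html) {
        foreach (count_words($html, $noiseWords) as $word => $n) {
            $index[$word][$url] = $n;
        }
    }
    return $index;
}

// Made-up sample pages, purely for illustration.
$pages = [
    '/about.html' => '<html><body style="x"><p>Our table tennis club</p>'
                   . '<p>Table hire</p></body></html>',
    '/style.html' => '<html><body><table><tr><td>Style guide</td></tr>'
                   . '</table></body></html>',
];

$index = build_index($pages, ['our', 'the']);
print_r($index['table'] ?? []);
```

With these sample pages, a search for "table" only finds /about.html (two occurrences in the text), even though /style.html contains a <table> element, and "body" is not in the index at all. That addresses the markup-matching worry, though it still gives none of the search operators that FULLTEXT or htdig provide.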