I thought he was extracting the words from the content... maybe just
using strip_tags(). Doing that and pushing to a fulltext field would
cover most of his bases.

Cheers,
Rob.

On Mon, 2008-02-04 at 14:37 -0600, Shawn McKenzie wrote:
> Inefficient, maybe. Lazy, most likely yes.
>
> I agree that htdig may be a better solution, however his current
> solution requires upkeep if the static HTML is changed and requires that
> the person populating the database pick all relevant words from the page
> and, if new ones are added, update the db.
>
> For example, if you add the entry for the fakeFlowers.html and don't
> think it's important to add "long lasting" to the db, even though it
> appears on the page, then that search comes up empty. Also, if the site
> owner adds a new page or just updates the Flowers.html to include
> "roses", then the db needs to be updated for that page or a new record
> added for the new page, etc.
>
> Unless, by FULLTEXT, you're implying that the full text of each page
> should be in the db, then I would argue that there is negligible diff
> between that and the grep. Then the only major diff is the
> maintainability, which the grep wins.
>
> -Shawn
>
> Robert Cummings wrote:
> > On Mon, 2008-02-04 at 14:13 -0600, Shawn McKenzie wrote:
> >> If there aren't many files and you don't intend to grow this site much
> >> larger and intend to always have static HTML, an easy implementation
> >> would be to read each file and search for the terms either in the
> >> keywords tag or in the entire file.
> >>
> >> Optionally, if you're on a *nix host you could exec() a grep for the
> >> terms which returns the matching lines in an array and display as needed.
> >
> > Wow, that has got to be the most inefficient lazy method I've ever
> > heard. I would never suggest such a route on a production server. His
> > original plan is much more efficient and is generally along the lines
> > of how search indexing works.
> > As such, for a simple site I'd do what he
> > suggests, using a FULLTEXT field in the database, or as Greg Donal
> > suggested, use something like htdig. A more involved solution would be
> > something like Lucene. Either way, you don't want to be scanning the
> > files on every search request.
> >
> > Cheers,
> > Rob.
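
For reference, a minimal sketch of the strip_tags()-into-FULLTEXT idea
discussed in this thread, assuming PHP 7+ with PDO and a MySQL table. The
table name "pages", its columns, and the helpers extractIndexText(),
indexPage() and searchPages() are made up for illustration; nothing here
is taken from the original poster's site.

```php
<?php
// Hypothetical helper: reduce a static HTML page to plain text for indexing.
function extractIndexText(string $html): string
{
    // strip_tags() keeps the text inside <script>/<style>, so drop those blocks first.
    $html = preg_replace('#<(script|style)\b[^>]*>.*?</\1>#is', ' ', $html);
    // Put a space before each tag so adjacent elements don't fuse into one word.
    $html = str_replace('<', ' <', $html);
    $text = strip_tags($html);
    // Turn entities such as &amp; back into the characters people search for.
    $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8');
    // Collapse runs of whitespace into single spaces.
    return trim(preg_replace('/\s+/', ' ', $text));
}

// Assumed schema (illustrative names, MySQL):
//   CREATE TABLE pages (
//     path VARCHAR(255) PRIMARY KEY,
//     body TEXT,
//     FULLTEXT (body)
//   );

// Hypothetical helper: (re)index one page. Run it when the file changes,
// not on every search request.
function indexPage(PDO $db, string $path): void
{
    $html = file_get_contents($path);
    if ($html === false) {
        throw new RuntimeException("Cannot read $path");
    }
    $stmt = $db->prepare('REPLACE INTO pages (path, body) VALUES (?, ?)');
    $stmt->execute([$path, extractIndexText($html)]);
}

// Hypothetical helper: return the paths of pages matching the search terms.
function searchPages(PDO $db, string $terms): array
{
    $stmt = $db->prepare('SELECT path FROM pages WHERE MATCH(body) AGAINST (?)');
    $stmt->execute([$terms]);
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}
```

Because the whole page text goes into the table, nobody has to hand-pick
keywords. Re-running indexPage() over the HTML directory after an edit
(from a cron job, say) would cover the upkeep concern without scanning
files per request.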