On Feb 4, 2008 7:42 PM, Jim Lucas <lists@xxxxxxxxx> wrote:
> Daniel Brown wrote:
> > On Feb 4, 2008 2:48 PM, Jason Pruim <japruim@xxxxxxxxxx> wrote:
> >> Hi Everyone! :)
> >>
> >> Just a quick question, I've done some googling but haven't been able
> >> to find what I need... I am looking at doing a search function for
> >> someone's website, the website is just static HTML files, and she
> >> doesn't want to redo the entire website to make it dynamic.
> >
> > I got bored, so I wrote out a system to handle it. Let me know if
> > you want the source when it's done.
> >
>
> So did I... :)
>
> Has options for searching recursively, case sensitivity, and displayable HTML
> only or the entire file. Also a restriction to limit which filetypes from the
> results it will display.
>
> It only works on *nix, I used grep. Don't have to worry about cron, scheduled
> tasks, etc to refresh a DB. I have it searching about 7000 files that range
> from plain text 2k all the way up to 60meg binary zip files, and it is rather
> quick at it.

    I was going to do the same thing, but I started the framework from
a different perspective: portability across platforms to be used to
spider local and remote sites. I can punch in any valid URL and
recursively scan all public links, remove all HTML, put plain text
into the database, and (when it's actually working a bit better) I'm
going to use a Google-style response on search. For example, when
terms are found, show snippets of the terms with four or five words on
each side of the terms.

    There are still some things I'd like to add in, should it be of
use to someone, including title searching, META searching, and blah,
blah, blah, but for now I'm just putting together the boring
foundation on which someone else can build the nice-looking house.

--
</Dan>

Daniel P. Brown
Senior Unix Geek
<? while(1) { $me = $mind--; sleep(86400); } ?>

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
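For anyone curious about the "snippets with four or five words on each side" idea in the message above, here is a minimal sketch of one way it might be done. It is not the code from this thread: the function name snippet_around() and its parameters are made up for illustration, it assumes PHP 7.1 or later, and it only handles a single-word search term.

```php
<?php
// Hypothetical helper, for illustration only (not the system described above).
// Returns the first occurrence of $term with up to $context words on each
// side, or null when the term does not appear in the page text.
function snippet_around(string $html, string $term, int $context = 5): ?string
{
    if ($term === '') {
        return null;
    }

    // Put a space before each tag so adjacent elements do not fuse into one
    // word, then strip the markup and decode entities to get plain text.
    $text  = strip_tags(str_replace('<', ' <', $html));
    $text  = trim(html_entity_decode($text, ENT_QUOTES, 'UTF-8'));
    $words = preg_split('/\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);
    $last  = count($words) - 1;

    foreach ($words as $i => $word) {
        // Case-insensitive match anywhere inside the word.
        if (stripos($word, $term) === false) {
            continue;
        }

        // Clamp the window to the bounds of the word list.
        $start = max(0, $i - $context);
        $end   = min($last, $i + $context);
        $slice = array_slice($words, $start, $end - $start + 1);

        // Mark truncation with "..." only on the sides that were cut.
        $prefix = $start > 0 ? '... ' : '';
        $suffix = $end < $last ? ' ...' : '';

        return $prefix . implode(' ', $slice) . $suffix;
    }

    return null;
}

$page = '<p>The quick brown fox jumps over the lazy dog near the old river bank today</p>';
echo snippet_around($page, 'lazy', 2), PHP_EOL;
// prints: ... over the lazy dog near ...
```

Splitting on whitespace keeps the sketch short, but punctuation stays attached to words and multi-word phrases are not matched. A real search would probably want to handle both, and to return every hit rather than only the first.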