Jay Paulson (CE CEN) wrote:
> Hello everyone! I've been given the responsibility of coding an Apache access_log parser. My task is to return the number of hits for certain file extensions on certain dates from specific IP addresses.
>
> As of now I'm only going back 7 days in the log looking for this information, and I'm only looking for 5 file types (.doc, .pdf, .html, .php, and .flv). I'm using the fgets() function so I can read the file line by line, do the matches I need, and increment the counters as needed. Right now I have 3 loops looking for everything, which doesn't seem to me to be the best way of doing this. I've also found that a line may contain the file extension I want when it's actually the source of a request for another file (see the example below).
>
> Log file example:
> I want the first line but not the second. The second line is a request for a .css file that was used by the .html file, so I don't want it. I do want the first line, where .html is the only file mentioned.
>
> 10.25.40.64 - - [01/Jan/2006:07:33:18 -0600] "GET /home.html HTTP/1.1" 200 8220 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
> 10.25.40.64 - - [01/Jan/2006:07:33:18 -0600] "GET /styles/redesign.css HTTP/1.1" 200 2381 "http://wfmu.wfm.pvt/home.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
>
> At any rate, here's some of my pseudo code/code for what I'm trying to accomplish. I know there has to be a better way to do this, and I'm looking for suggestions!

<snip>

Save yourself a ton of work. Dump the raw logs into a db, and you can do all the queries on the db. Something like this...

I took your idea and did a search on Google and found that this has already been done for me! Check it out!

http://www.php-scripts.com/php_diary/012103.php3

Very cool :)

jay
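
For anyone landing on this thread later, here is a rough sketch of the two ideas discussed above: match the extension against the requested path only (so a .html that appears solely in the referrer field is ignored), and load the matching hits into a database so the counting becomes a single GROUP BY query instead of three loops. It is written in Python with an in-memory SQLite database rather than PHP, purely for illustration. The names parse_line, load_hits, count_hits, and the hits table are made up for this example, and the regex assumes Apache's combined log format as shown in the sample lines.

```python
import os
import re
import sqlite3
from datetime import date, datetime, timedelta
from urllib.parse import urlsplit

# Assumed layout (combined log format):
# ip ident user [timestamp] "METHOD path PROTOCOL" status bytes "referrer" "agent"
LOG_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)

WANTED = {".doc", ".pdf", ".html", ".php", ".flv"}


def parse_line(line):
    """Return (ip, day, ext) if the requested file has a wanted extension, else None."""
    m = LOG_RE.match(line)
    if m is None:
        return None
    # Only the request path is inspected; the referrer field is never looked at.
    path = urlsplit(m.group("path")).path
    ext = os.path.splitext(path)[1].lower()
    if ext not in WANTED:
        return None
    when = datetime.strptime(m.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
    return m.group("ip"), when.date().isoformat(), ext


def load_hits(lines, conn):
    """Parse each line once and store the matching hits."""
    conn.execute("CREATE TABLE IF NOT EXISTS hits (ip TEXT, day TEXT, ext TEXT)")
    rows = (row for row in map(parse_line, lines) if row is not None)
    conn.executemany("INSERT INTO hits VALUES (?, ?, ?)", rows)


def count_hits(conn, since_day):
    """Count hits per (day, ip, ext) on or after since_day (ISO date string)."""
    cur = conn.execute(
        "SELECT day, ip, ext, COUNT(*) FROM hits WHERE day >= ? "
        "GROUP BY day, ip, ext ORDER BY day, ip, ext",
        (since_day,),
    )
    return cur.fetchall()


# The two sample lines from the original post.
SAMPLE = [
    '10.25.40.64 - - [01/Jan/2006:07:33:18 -0600] "GET /home.html HTTP/1.1" 200 8220 '
    '"-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"',
    '10.25.40.64 - - [01/Jan/2006:07:33:18 -0600] "GET /styles/redesign.css HTTP/1.1" 200 2381 '
    '"http://wfmu.wfm.pvt/home.html" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"',
]

conn = sqlite3.connect(":memory:")
load_hits(SAMPLE, conn)

# Seven-day window ending on the day of the sample entries.
since = (date(2006, 1, 1) - timedelta(days=7)).isoformat()
print(count_hits(conn, since))
# [('2006-01-01', '10.25.40.64', '.html', 1)]
```

With a real log, the SAMPLE list would be replaced by an open file handle, which is read line by line much like an fgets() loop. Filtering to specific IP addresses would just be an extra WHERE clause on the same table.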