Rich Bowen wrote:
> These are optional fields which *may* be passed by a user agent. When they
> are passed, they are not reliable - that is, they may be spoofed, trivially.

Understood. I'm not depending on them for any decision-making. The issue is that Analog discards those lines, so (for example) requests logged for a particular file (which are missing those two fields) are discarded and not counted for purposes of things like "top 25 requested files".

Also, they're completely absent, despite the escaped "s in the LogFormat directive which should generate either "" "" or "-" "-" when the fields are missing.

> It would be interesting to see what version of what browser released in the
> last 30 days.

Most of the clients accessing the site in question are using ancient browsers - in one case where I investigated fully, the client PC is running Windows 2000 and IE 6. Some of its accesses had the Referer and User-Agent logged, while others had them missing.

One system where I have logs going back 2+ years shows a number of entries with missing fields at a reasonably constant rate (200 to 5000 per month), with no big jump. Oddly, that's the system where I'd expect new client versions (like Firefox 5) to show up, yet the number of logged lines where the fields are missing remains relatively constant.

It seems that either both fields are properly present, or both are missing. I was unable to locate any log lines which had either a Referer or "-" but which were missing the User-Agent field.

> Oh. Hmm. That's interesting. What I would look for, in that case, is more
> than one LogFormat directive logging to the same location.

I thought of that and checked it previously. However, I just checked it again (Apache 2.0.63 system):

(0:12) www:/usr/local/etc/apache2# grep CustomLog *
httpd.conf:# a CustomLog directive (see below).
httpd.conf:#CustomLog /var/log/httpd-access.log common
httpd.conf:#CustomLog /var/log/httpd-referer.log referer
httpd.conf:#CustomLog /var/log/httpd-agent.log agent
httpd.conf:CustomLog /var/log/httpd-access.log combined
httpd.conf:# CustomLog /var/log/dummy-host.example.com-access_log common
httpd.conf:CustomLog /var/log/httpd-deflate.log deflate
ssl.conf:CustomLog /var/log/httpd-ssl_request.log \
ssl.conf_orig:CustomLog /var/log/httpd-ssl_request.log \

I only see 3 uncommented CustomLog directives: one for a combined log, a separate one that logs deflate info, and a third one for SSL requests.

There also isn't any discernible pattern to the entries with the missing fields - some CGI requests are logged with them, some without. Same for PHP. Some are for 404's, some are for successful file access. I'm baffled.

I wonder if anyone else is having the same issue, but didn't notice it. For example, Analog will only complain about "Large number of corrupt lines in logfile" if they exceed a certain percentage threshold of the total number of lines in the log file.

The following (disgusting, I really should use awk) command string should report the total number of lines missing the Referer and User-Agent fields in a combined-format logfile, at least if the default timestamp format is used:

    cut -d \] -f 2-99 /var/log/httpd-access.log | cut -d \" -f 3-99 | cut -d " " -f 4-99 | grep ^$ | wc -l

Anybody want to try it? (Of course, satisfy yourself that it can't do anything evil first).
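For anyone who would rather not chain three cuts, here is roughly what an awk version might look like. This is only a sketch under a few assumptions: the log is in the stock combined format, a line with the fields missing ends right after the status and byte-count fields (possibly with trailing blanks), and the function name count_missing is made up for illustration. It is not guaranteed to give exactly the same count as the cut pipeline on odd or malformed lines.

```shell
# Count combined-format log lines that end right after the status and
# byte-count fields, i.e. with no Referer or User-Agent fields at all.
# count_missing is a hypothetical helper name, not part of Apache or Analog.
count_missing() {
    awk '
        # Line ends with: closing quote of the request, 3-digit status,
        # byte count (or "-"), optional trailing blanks.
        /" [0-9][0-9][0-9] ([0-9]+|-)[ \t]*$/ { missing++ }
        END {
            pct = (NR > 0) ? 100 * missing / NR : 0
            printf "%d of %d lines (%.1f%%) lack Referer/User-Agent\n", missing, NR, pct
        }
    ' "$@"
}
```

Usage would be something like `count_missing /var/log/httpd-access.log`. Printing the total line count and percentage alongside the raw count makes it easier to compare systems with very different traffic levels.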
On two of my production systems running 2.0.63:

(0:23) www:/tmp# cut -d \] -f 2-99 /var/log/httpd-access.log | cut -d \" -f 3-99 | cut -d " " -f 4-99 | grep ^$ | wc -l
743308
(0:24) www:/tmp# wc -l /var/log/httpd-access.log
4802394 /var/log/httpd-access.log

(0:175) gate:/tmp# cut -d \] -f 2-99 /var/log/httpd-access.log | cut -d \" -f 3-99 | cut -d " " -f 4-99 | grep ^$ | wc -l
99583
(0:176) gate:/tmp# wc -l /var/log/httpd-access.log
3658733 /var/log/httpd-access.log

On a 2.2.19 test system I just brought up:

(0:36) test:/tmp# cut -d \] -f 2-99 /var/log/httpd-access.log | cut -d \" -f 3-99 | cut -d " " -f 4-99 | grep ^$ | wc -l
433
(0:37) test:/tmp# wc -l /var/log/httpd-access.log
1321 /var/log/httpd-access.log

The test system is particularly interesting as I did NOT copy the Apache configuration files from a production system - I configured it by editing the default config files. So this shouldn't be a cut-and-paste error.

        Terry Kennedy             http://www.tmk.com
        terry@xxxxxxx             New York, NY USA

---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@xxxxxxxxxxxxxxxx
   "   from the digest: users-digest-unsubscribe@xxxxxxxxxxxxxxxx
For additional commands, e-mail: users-help@xxxxxxxxxxxxxxxx