On Tue, 1 Mar 2005, Martin Marji Cermak wrote:
But I noticed that the access-log on the log server was not complete! During peak time (when Squid served more than 150 requests/sec), access-log lines were lost. To be sure this was the case, I inserted a counter into every access-log line, and indeed there were gaps in the numbers in the access-log on the log server.
I guess the reason is that syslog logging over the network uses UDP and does not care about lost messages when it is overloaded.
Yes, this is the primary reason why no Squid developers have found it a high priority to add syslog access.log logging.
Syslog over UDP is inherently unreliable in that it may lose records for a number of different reasons:
a) Packet loss.
b) Syslog server overload, especially if the syslog server is logging synchronously (see the syslog server documentation for details).
c) Network overload causing indirect packet loss.
a) Can be avoided by making sure the network equipment is in good shape and running at 100 Mbps full duplex.
b) Can be limited by configuring the syslog server to not sync the logs on every write (see the syslog.conf sketch after this list).
c) Is harder, but should not be a problem for access.log traffic if your network is 100 Mbps full duplex.
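To illustrate (b): with a sysklogd-style syslog daemon you disable the sync-on-every-write behaviour by prefixing the destination file with a "-". A minimal sketch, assuming the Squid box logs via the local4 facility (the facility and path are only examples; check your syslog daemon's documentation, as the syntax varies between implementations):

    # /etc/syslog.conf
    # The leading "-" tells syslogd not to sync the file after every record,
    # which keeps a busy access log from saturating the log server's disks.
    local4.*        -/var/log/squid-access.log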
So I had to change the logging logic to the following:
log only error (HTTP status code >= 400) access-log records via syslog
log the full access-log locally, but rotate it regularly (so I have complete records for at least the last 10 hours)
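The local-rotation half of this is easy to reproduce; a minimal sketch, assuming a source-installed Squid 2.5 under /usr/local/squid (the paths and the hourly schedule are only examples):

    # squid.conf: keep 10 old log generations when rotating
    logfile_rotate 10

    # crontab: ask Squid to rotate its logs once an hour,
    # giving roughly 10 hours of history with the setting above
    0 * * * *   /usr/local/squid/sbin/squid -k rotate

How the error-only syslog feed is produced is not shown here; Squid 2.5 has no built-in status-code filter for syslog, so that part depends on local patches or an external filter.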
Won't help Kevin much however, as he had a problem with the I/O system on his Squid server being saturated, making Squid block when writing the access log.
What helps there is to have access.log, cache.log and swap.log (the cache_dir swap logs) on a separate drive not shared by the cache.
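In squid.conf terms (Squid 2.5 directive names; the paths are only examples) that means something along these lines:

    # Logs on a dedicated log disk...
    cache_access_log /logdisk/squid/access.log
    cache_log        /logdisk/squid/cache.log
    cache_store_log  /logdisk/squid/store.log
    cache_swap_log   /logdisk/squid/swap.state

    # ...while the cache spool stays on its own disk(s)
    cache_dir ufs /cachedisk1/squid 20000 16 256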
Regards Henrik