On 3/07/2013 3:51 p.m., Hussam Al-Tayeb wrote:
This is Squid 3.3.6.
[root@LARS squid]# tail -n 30 /var/log/squid/cache.log
2013/07/03 06:34:59| Target number of buckets: 40392
2013/07/03 06:34:59| Using 65536 Store buckets
2013/07/03 06:34:59| Max Mem size: 262144 KB
2013/07/03 06:34:59| Max Swap size: 10240000 KB
2013/07/03 06:34:59| Rebuilding storage in /home/squid (clean log)
2013/07/03 06:34:59| Using Least Load store dir selection
2013/07/03 06:34:59| Set Current Directory to /home/squid
2013/07/03 06:34:59| Loaded Icons.
2013/07/03 06:34:59| HTCP Disabled.
2013/07/03 06:34:59| Squid plugin modules loaded: 0
2013/07/03 06:34:59| Accepting NAT intercepted HTTP Socket connections at local=0.0.0.0:3128 remote=[::] FD 16 flags=41
2013/07/03 06:34:59| Accepting HTTP Socket connections at local=127.0.0.1:3129 remote=[::] FD 17 flags=9
2013/07/03 06:34:59| Store rebuilding is 1.33% complete
2013/07/03 06:35:00| Done reading /home/squid swaplog (299709 entries)
2013/07/03 06:35:00| Finished rebuilding storage from disk.
2013/07/03 06:35:00| 299709 Entries scanned
2013/07/03 06:35:00| 0 Invalid entries.
2013/07/03 06:35:00| 0 With invalid flags.
2013/07/03 06:35:00| 299709 Objects loaded.
2013/07/03 06:35:00| 0 Objects expired.
2013/07/03 06:35:00| 0 Objects cancelled.
2013/07/03 06:35:00| 0 Duplicate URLs purged.
2013/07/03 06:35:00| 0 Swapfile clashes avoided.
2013/07/03 06:35:00| Took 1.36 seconds (220842.74 objects/sec).
2013/07/03 06:35:00| Beginning Validation Procedure
2013/07/03 06:35:00| 262144 Entries Validated so far.
2013/07/03 06:35:00| Completed Validation Procedure
2013/07/03 06:35:00| Validated 299707 Entries
2013/07/03 06:35:00| store_swap_size = 8047432.00 KB
2013/07/03 06:35:01| storeLateRelease: released 0 objects
[root@LARS squid]# find . -type f | wc -l
299706
This means there are 299706 files in /home/squid including the swap.state file,
so a total of 299705 objects on disk, but swap.state thinks there are 299709
files - i.e. it lists 4 objects that have no file on disk.
Interesting. If you scan swap.state and pull out the "fileno" field it
should map to the directory "L1/L2/file" structure. Can you identify
which objects account for the discrepancy and what swap.state is saying
about them? (A rough scan script is sketched below.)
I suspect this is related to the MD5 mismatches via shutdown_timeout
side effects, i.e. that the objects got recorded to swap.state as SWAPOUT,
but the I/O sending them to disk was aborted along with Squid before the
file was written by the OS.
Or that could be way off base and it could be something very weird instead...
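Something along these lines would do the scan - an untested Python sketch only.
It assumes the cache_dir uses the stock 16 L1 / 256 L2 directory split, and the
RECORD_FMT it uses for the binary swap.state records is a guess; the real layout
is build-dependent (field sizes, padding, extra version/checksum records), so
check src/StoreSwapLogData.h for your build and adjust it before trusting the
output:

#!/usr/bin/env python3
# Untested sketch: cross-check swap.state filenos against the files actually
# present under the cache_dir, and report entries with no matching file.
import os
import struct

CACHE_DIR = "/home/squid"
SWAP_STATE = os.path.join(CACHE_DIR, "swap.state")
L1, L2 = 16, 256                  # adjust to match your cache_dir line

SWAP_LOG_ADD, SWAP_LOG_DEL = 1, 2
# GUESSED record layout (op, pad, fileno, 4 timestamps, size, refcount, flags,
# MD5 key) - verify against src/StoreSwapLogData.h for your build.
RECORD_FMT = "=B3xiqqqqQHH16s"
RECORD_LEN = struct.calcsize(RECORD_FMT)

def fileno_to_path(fileno):
    """Map a swap fileno to its cache_dir/%02X/%02X/%08X path."""
    d1 = (fileno // L2 // L2) % L1
    d2 = (fileno // L2) % L2
    return os.path.join(CACHE_DIR, "%02X" % d1, "%02X" % d2, "%08X" % fileno)

def swap_state_filenos():
    """Replay swap.state: ADD records insert a fileno, DEL records drop it."""
    live = {}
    with open(SWAP_STATE, "rb") as f:
        while True:
            rec = f.read(RECORD_LEN)
            if len(rec) < RECORD_LEN:
                break
            fields = struct.unpack(RECORD_FMT, rec)
            op, fileno, md5 = fields[0], fields[1], fields[-1]
            if op == SWAP_LOG_ADD:
                live[fileno] = md5
            elif op == SWAP_LOG_DEL:
                live.pop(fileno, None)
    return live

def disk_filenos():
    """Collect filenos from the %08X file names found under the cache_dir."""
    found = set()
    for _dirpath, _dirs, names in os.walk(CACHE_DIR):
        for name in names:
            try:
                found.add(int(name, 16))
            except ValueError:
                pass              # swap.state and other non-object files
    return found

if __name__ == "__main__":
    indexed = swap_state_filenos()
    on_disk = disk_filenos()
    for fn in sorted(set(indexed) - on_disk):
        print("indexed but missing on disk: %s key=%s"
              % (fileno_to_path(fn), indexed[fn].hex()))
    for fn in sorted(on_disk - set(indexed)):
        print("on disk but not indexed: %s" % fileno_to_path(fn))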
Another thing I found:
cat /var/log/squid/cache.log | grep WARNING | grep swapin
2013/06/08 23:35:45| WARNING: 1 swapin MD5 mismatches
2013/06/09 00:28:59| WARNING: 10 swapin MD5 mismatches
2013/06/09 12:20:56| WARNING: 1 swapin MD5 mismatches
2013/06/09 12:25:46| WARNING: 10 swapin MD5 mismatches
2013/06/09 14:40:18| WARNING: 1 swapin MD5 mismatches
2013/06/10 02:31:02| WARNING: 1 swapin MD5 mismatches
2013/06/10 13:00:37| WARNING: 1 swapin MD5 mismatches
2013/06/10 22:59:53| WARNING: 1 swapin MD5 mismatches
2013/06/12 14:41:23| WARNING: 1 swapin MD5 mismatches
2013/06/15 04:37:03| WARNING: 1 swapin MD5 mismatches
2013/06/28 07:07:33| WARNING: 1 swapin MD5 mismatches
2013/06/28 21:51:19| WARNING: 1 swapin MD5 mismatches
2013/06/29 23:41:49| WARNING: 1 swapin MD5 mismatches
2013/06/30 04:14:24| WARNING: 10 swapin MD5 mismatches
2013/06/30 19:10:49| WARNING: 1 swapin MD5 mismatches
2013/06/30 19:10:49| WARNING: 10 swapin MD5 mismatches
2013/07/02 13:44:11| WARNING: 1 swapin MD5 mismatches
2013/07/02 13:48:35| WARNING: 10 swapin MD5 mismatches
2013/07/02 23:27:32| WARNING: 100 swapin MD5 mismatches
2013/07/03 06:02:18| WARNING: 1 swapin MD5 mismatches
2013/07/03 06:20:43| WARNING: 1 swapin MD5 mismatches
2013/07/03 06:26:59| WARNING: 10 swapin MD5 mismatches
1) Is any of the above something to worry about?
MD5 mismatches are. They mean corrupt cache content. The usual causes (in
order of commonality) are:
* shutdown_timeout set too short - this timeout is designed to allow
client traffic to complete and close cleanly. Aborting Squid earlier
corrupts any incomplete I/O. This is bad as it affects clients' live
traffic, not just cache files (which can recover easily, as seen above).
See the note after this list.
* disk failure - the usual bit-loss as disks die slowly. The large count
of failures showing up is a worry if that is the cause.
* somebody playing with the cache content directly.
* I/O errors when saving content to cache. Rare during normal operation
but they do occur (IME usually when RAID is slowing the disk I/O down!)
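A note on the first cause: the squid.conf directive involved is spelled
shutdown_lifetime - 30 seconds by default, if memory serves. If it has been
shortened to make restarts snappy, putting it back, e.g.

  shutdown_lifetime 30 seconds

gives in-flight swapouts time to complete cleanly.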
2) Does Squid resolve the file mismatch eventually as I reach the max size of
the cache dir?
If Squid thinks there are more files on disk than are actually there, then I
would not expect there to be any problem. If anything Squid would be
*under* capacity by the size of the phantom files. The cache size
maintenance should still work normally. There may be some form of
self-correction if the phantom files get evicted from the index.
3) The swapin MD5 mismatches problem. Is it something I can fix? If so, how?
See above for the causes. The fix is anything you can do to determine which
of them is happening here and to resolve it.
Any other information I can post to help detect where the problem is?
Unfortunately the debug output which shows which files are failing the
checksum has been disabled, for unknown reasons. So not really.
Amos