Hello!
I have been trying to find a solution to my problem for a long time,
but with no luck. It all started when I installed Gentoo with Squid on
a 64-bit architecture. The machine is a dual Xeon 3.4 with 2 GB of RAM.
My CFLAGS are "-O3 -march=nocona -pipe -fomit-frame-pointer". The
problem did not happen on a 32-bit architecture.
The problem is that a few hours after starting, squid begins to
segfault. Here is a snippet of my logs:
Dec 9 13:00:58 faramir squid[18109]: segfault at 00000001015cc190 rip
00002aaaab798b16 rsp 00007fffffc3c4f0 error 4
Dec 9 13:25:46 faramir squid[18315]: segfault at 0000000100d161f0 rip
00002aaaab798b16 rsp 00007fffffd8f2f0 error 4
Dec 9 13:47:45 faramir squid[18601]: segfault at 0000000101ad26b8 rip
00002aaaab7995a1 rsp 00007ffffff5bab0 error 4
Dec 9 14:11:10 faramir squid[18856]: segfault at 0000000100bd8000 rip
00002aaaab798b16 rsp 00007fffffd3f370 error 4
Dec 9 14:34:50 faramir squid[19151]: segfault at 0000000101701d80 rip
00002aaaab798b16 rsp 00007fffff962460 error 4
Dec 9 14:55:35 faramir squid[19417]: segfault at 00000001019c5930 rip
00002aaaab798b16 rsp 00007fffff80a8e0 error 4
Dec 9 15:22:28 faramir squid[19682]: segfault at 0000000100000010 rip
00002aaaab798b16 rsp 00007fffff8a60b0 error 4
Dec 9 15:43:24 faramir squid[20001]: segfault at 0000000101ade030 rip
00002aaaab798b16 rsp 00007fffffde07e0 error 4
Dec 9 16:07:08 faramir squid[20252]: segfault at 00000001018a1ec0 rip
00002aaaab798b16 rsp 00007fffff80b170 error 4
Dec 9 16:37:23 faramir squid[20536]: segfault at 0000000100000010 rip
00002aaaab798b16 rsp 00007fffffea0800 error 4
Dec 9 17:03:36 faramir squid[20878]: segfault at 000000010aacfd10 rip
00002aaaab798b16 rsp 00007fffff83a370 error 4
Dec 9 17:30:09 faramir squid[21189]: segfault at 00000001013c3590 rip
00002aaaab798b16 rsp 00007fffffa677f0 error 4
Dec 9 17:57:08 faramir squid[21516]: segfault at 0000000100d524c0 rip
00002aaaab798b16 rsp 00007fffffb5d420 error 4
Dec 9 18:24:16 faramir squid[21816]: segfault at 00000001014d0bf0 rip
00002aaaab798b16 rsp 00007fffffdf3b60 error 4
Dec 9 18:49:24 faramir squid[22131]: segfault at 0000000101390710 rip
00002aaaab798b16 rsp 00007ffffff6f960 error 4
Dec 9 19:16:18 faramir squid[22420]: segfault at 0000000100000010 rip
00002aaaab798b16 rsp 00007fffff8ccdf0 error 4
Dec 9 19:45:15 faramir squid[22791]: segfault at 0000000101357630 rip
00002aaaab798b16 rsp 00007fffff9286a0 error 4
Dec 9 20:11:12 faramir squid[23136]: segfault at 00000001014d8610 rip
00002aaaab798b16 rsp 00007ffffff0ad60 error 4
Dec 9 20:22:43 faramir squid[23494]: segfault at 000000010123d540 rip
00002aaaab798b16 rsp 00007fffffa35700 error 4
Dec 9 20:48:16 faramir squid[23648]: segfault at 0000000101dea870 rip
00002aaaab798b16 rsp 00007fffffeb17a0 error 4
Dec 9 21:07:01 faramir squid[23939]: segfault at 00000001016aef40 rip
00002aaaab798b16 rsp 00007fffffd6aab0 error 4
Dec 9 21:41:41 faramir squid[24184]: segfault at 00000001015a8570 rip
00002aaaab798b16 rsp 00007fffffbd03a0 error 4
Dec 9 22:14:04 faramir squid[24580]: segfault at 000000010f49d1e0 rip
00002aaaab798b16 rsp 00007fffffb96fc0 error 4
Dec 9 22:29:50 faramir squid[24943]: segfault at 000000010157a410 rip
00002aaaab798b16 rsp 00007fffffd6e990 error 4
Dec 9 22:43:39 faramir squid[25135]: segfault at 0000000100f01870 rip
00002aaaab798b16 rsp 00007fffffb0d630 error 4
Dec 9 23:11:11 faramir squid[25330]: segfault at 00000001020292d0 rip
00002aaaab798b16 rsp 00007fffffedadc0 error 4
Dec 9 23:23:45 faramir squid[25663]: segfault at 0000000101323b80 rip
00002aaaab798b16 rsp 00007fffffb9c0c0 error 4
Dec 9 23:42:52 faramir squid[25821]: segfault at 00000001017e2c90 rip
00002aaaab798b16 rsp 00007fffffe44c00 error 4
And so on. The error code is always 4, the faulting address changes,
and over the last week rip has almost always been the same:
00002aaaab798b16, or occasionally 00002aaaab7995a1. I went through the
kernel sources and found that error 4 means the segfault was caused by
a user-space application trying to read a page that was not present.
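
For anyone who wants to check the same things against their own logs,
here is a minimal Python sketch. It assumes the kernel message format
shown above and the usual x86 page-fault error code layout (bit 0:
protection violation vs. page not present, bit 1: write vs. read, bit 2:
user vs. kernel mode). The names SEGFAULT_RE, decode_page_fault_error
and summarize are made up for this example and are not part of Squid or
the kernel.

```python
import re
from collections import Counter

# Matches the kernel's segfault message. \s+ between fields tolerates the
# line wrapping introduced by the mail client.
SEGFAULT_RE = re.compile(
    r"segfault at\s+(?P<addr>[0-9a-f]+)\s+rip\s+(?P<rip>[0-9a-f]+)\s+"
    r"rsp\s+(?P<rsp>[0-9a-f]+)\s+error\s+(?P<error>\d+)"
)


def decode_page_fault_error(code):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "cause": "protection violation" if code & 0x1 else "page not present",
        "access": "write" if code & 0x2 else "read",
        "mode": "user" if code & 0x4 else "kernel",
    }


def summarize(log_text):
    """Count how often each rip value and error code appears."""
    rips, errors = Counter(), Counter()
    for match in SEGFAULT_RE.finditer(log_text):
        rips[match.group("rip")] += 1
        errors[int(match.group("error"))] += 1
    return rips, errors


# Two entries copied from the log excerpt above.
sample = """\
Dec 9 13:25:46 faramir squid[18315]: segfault at 0000000100d161f0 rip
00002aaaab798b16 rsp 00007fffffd8f2f0 error 4
Dec 9 13:47:45 faramir squid[18601]: segfault at 0000000101ad26b8 rip
00002aaaab7995a1 rsp 00007ffffff5bab0 error 4
"""

rips, errors = summarize(sample)
for rip, count in rips.most_common():
    print(rip, count)
for code, count in errors.items():
    print(code, decode_page_fault_error(code), count)
```

Feeding the whole log file to summarize() instead of the two sample
entries shows at a glance whether the crashes really cluster on one or
two instruction addresses.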
I am using squid-2.5.STABLE12. It also happens with STABLE10 and
STABLE11. The funny thing is that squid is still running after the
segfault. When it occurs I get "Connection refused" from the squid
daemon, and then it works fine again. But those errors are annoying my
customers.
At http://alchemyx.uznam.net.pl/squid/ you can find my MRTG graphs of
data collected via SNMP. They look really strange.
I would appreciate any help. Maybe somebody has had the same issue, or
can point me to some kind of Squid-debugging-HOWTO; I can try that too.
Thank you in advance
and Merry Christmas (soon!)
--
Michał Margula, alchemyx@xxxxxxxxxxxx, http://alchemyx.uznam.net.pl/
"W życiu piękne są tylko chwile" [Ryszard Riedel]