Re: SMP-Rock-frequent FATAL: Received Segment Violation...dying."only on kid3"

Eliezer Croitoru <eliezer@xxxxxxxxxxxx> · Tue, 19 Nov 2013 01:18:48 +0200

Hey Dr,

(notes inside)

On 19/11/13 00:54, Amos Jeffries wrote:
Either event may have corrupted it slightly. Squid is supposed to
contain sufficient checksum protection in rock to cope with most forms
of corruption, but nobody's perfect.

So, please try to get a core dump, or stack trace of the problem before
going any further. This will help us to isolate where the problem is
occurring. If it is corruption related we will be needing to try and add
better protection for that case.
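
For reference, one possible way to collect that backtrace is sketched below. It is only a sketch: the binary and core file paths are assumptions and will differ per install, and `squid_backtrace` is a made-up helper name. The `coredump_dir` directive in squid.conf controls which directory Squid changes into at startup, which is where core files end up.

```shell
# Sketch only: both paths are assumptions, adjust them to your install.
SQUID_BIN="${SQUID_BIN:-/usr/sbin/squid}"
CORE_FILE="${CORE_FILE:-/var/spool/squid/core}"

# Allow core files in the shell that starts Squid.
ulimit -c unlimited || echo "could not raise the core file limit" >&2

# Print a full backtrace of every thread from an existing core file.
# $1 = path to the squid binary, $2 = path to the core file
squid_backtrace() {
    gdb --batch -ex "thread apply all bt full" "$1" "$2"
}

# Usage, once a core file exists:
#   squid_backtrace "$SQUID_BIN" "$CORE_FILE" > backtrace.txt
```

A binary built with debug symbols gives a much more readable trace.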

*after* that, please try:

* shutting down your Squid by any means necessary to ensure there are 0
processes running.
"pgrep squid"

* *move* the caches to somewhere they can be analysed later if necessary.

* rebuild the configured cache_dir with squid -z

* wait until the -z process has completed *AND* there are 0 processes still
running in the background
And no traffic at all on the server.

* restart the main Squid

This entire process should not take more than a minute.
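
The steps above could be scripted roughly as follows. This is a sketch under assumptions: `CACHE_DIR` and `BACKUP_DIR` are placeholder paths (use the path from your own cache_dir line), the function names are made up, and the process is assumed to be named exactly `squid`.

```shell
# Sketch of the steps above. Both paths are assumptions.
CACHE_DIR="${CACHE_DIR:-/var/spool/squid/rock}"
BACKUP_DIR="${BACKUP_DIR:-/var/spool/squid/rock.old.$(date +%Y%m%d-%H%M%S)}"

# Block until no process named exactly "squid" is left.
# -x avoids matching a script whose own name contains "squid".
wait_for_no_squid() {
    while pgrep -x squid >/dev/null; do
        sleep 1
    done
}

rebuild_cache() {
    squid -k shutdown              # ask the running Squid to stop
    wait_for_no_squid              # confirm there are 0 processes
    mv "$CACHE_DIR" "$BACKUP_DIR"  # keep the old cache for later analysis
    squid -z                       # rebuild the configured cache_dir
    wait_for_no_squid              # -z may still be working in the background
    squid                          # start the main Squid again
}

# Usage: run as the user that normally starts Squid
#   rebuild_cache
```

If `squid -z` cannot recreate the directory on your setup, create it by hand with the ownership Squid expects before that step.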

If the problem remains after doing that you will have successfully
eliminated cache corruption as a cause and we go back to needing a
backtrace to figure it out.

The same result/test can be achieved by running the service in "RAM
only" cache mode.
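
One way to get a "RAM only" setup is to run Squid from a copy of squid.conf in which every cache_dir line is commented out, so that only the memory cache (sized by cache_mem) is left. A minimal sketch, where `make_ram_only_conf` is a made-up helper and the paths in the usage notes are assumptions:

```shell
# Sketch: write a copy of a squid.conf with every active cache_dir line
# commented out. $1 = original config, $2 = new config to write.
make_ram_only_conf() {
    sed -E 's/^([[:space:]]*)(cache_dir[[:space:]])/\1# \2/' "$1" > "$2"
}

# Usage (paths are assumptions):
#   make_ram_only_conf /etc/squid/squid.conf /etc/squid/squid-ramonly.conf
#   squid -k parse -f /etc/squid/squid-ramonly.conf   # check the syntax
#   squid -f /etc/squid/squid-ramonly.conf            # start with that config
```

The original squid.conf stays untouched, so going back to the disk cache is only a restart away.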

If these "dying" events happen lots of times, it (probably) means that
you will need only a very small amount of run-time to reproduce the
problem.

I assume that letting the service run in "RAM only" mode will allow
it to still serve clients for more than 24/48 hours smoothly, with
0 problems in cache.log.
If the troubles do not show up in RAM only mode, we can be more than
90% sure that the cause was related to the DISK cache.
Even then, I would not rush to say that rock was at fault.

If you can attach the details of the related DISKs/partitions from
fstab, it might help more.
(Feel free to send all the technical data in a PM, since it can take
time to process, especially the core dumps.)
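
A small sketch for collecting those disk details. The function names are made up and the default cache path is an assumption, so pass your own cache_dir path.

```shell
# Sketch: gather partition details for the cache disks.

# Print fstab without comments and blank lines.
# $1 = fstab file to read (defaults to /etc/fstab)
fstab_entries() {
    grep -Ev '^[[:space:]]*(#|$)' "${1:-/etc/fstab}"
}

# Print the fstab entries plus size and usage of the filesystem
# holding the cache directory. $1 = cache directory (assumed default).
cache_disk_report() {
    fstab_entries
    df -h "${1:-/var/spool/squid}"
}

# Usage:
#   cache_disk_report /var/spool/squid > disk-details.txt
```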

Regards,
Eliezer