On Thu, Nov 25, 2010 at 12:27:50AM +0000, Amos Jeffries wrote:
> On Wed, 24 Nov 2010 13:26:03 +0000, Declan White <declanw@xxxxxxxxxxxx>
> wrote:
> > I've got some 'uncaught exception' coredumping squids which are leaving no
> > clues about their deaths.
> > They are *meant* to be sending an SOS via:
> >
> >   main.cc:1162: std::cerr << "dying from an unhandled exception: " << e.what() << std::endl;
> >
> > but std::cerr isn't the cache_log is it. It's STDERR, aka FD 2.
> >
> >   COMMAND   PID  USER  FD  TYPE  DEVICE  SIZE/OFF  NODE  NAME
> >   squid   22444 squid  2u  VCHR    13,2       0t0  3398  /devices/pseudo/mm@0:null
> >
> > .. which according to lsof has been /dev/nulled, which is odd, as I had it
> > redirected to a file when it was started.
> >
> > Should the fallback exception handler not be using another reporting
> > channel?
> >
> > I also notice that the root parent squid which waits for the child
> > eventually disappears, after restarting crashes, making the next crash
> > fatal. Is that normal? Does it react badly if it catches a HUP sent by a
> > 'pkill -HUP squid' ?
> >
> > DW
>
> hmm, how many and what particular processes are running? which particular
> sub-process(es) is this happening to? how are you starting squid? etc. etc.
>
> For background, by default only the master process uses stderr as itself.
> All sub-processes have their stderr redirected to cache.log.

It looks like it's decided by whether or not you use the -N non-daemonise
startup flag.
The auth sub processes always have STDERR correctly redirected to cache_log,
but without -N, the worker squid in the squid/root-squid pair leaves no STDERR
open for itself.

I'll get my farm using 'squid -N &' when they next hit a quiet period (and I'm
awake). This will also fix my HUP problem, the non-worker root-squid does
indeed drop dead on HUP.

squid 3.1.9 on Solaris 9 64bit btw.

DW

> Amos
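
For what it's worth, the "another reporting channel" idea for the fallback exception handler could look roughly like the sketch below. This is not Squid's actual code, only an illustration under some assumptions: the helper names fatalMessage, reportFatal and runGuarded are made up, as is the "squid-sketch" syslog ident, and syslog is just one possible second channel. The point is that the message still goes to std::cerr, but a copy also goes somewhere that does not depend on FD 2 being open.

```cpp
#include <exception>
#include <iostream>
#include <stdexcept>
#include <string>
#include <syslog.h>

// Build the same text that main.cc:1162 prints, so both channels carry
// an identical message.
std::string fatalMessage(const std::exception &e)
{
    return std::string("dying from an unhandled exception: ") + e.what();
}

// Hypothetical helper (not Squid code): report on stderr AND via syslog,
// because FD 2 may point at /dev/null once the process has daemonised.
void reportFatal(const std::exception &e)
{
    const std::string msg = fatalMessage(e);
    std::cerr << msg << std::endl;                 // lost if FD 2 is /dev/null
    openlog("squid-sketch", LOG_PID, LOG_DAEMON);  // ident is an assumption
    syslog(LOG_ALERT, "%s", msg.c_str());          // second, independent channel
    closelog();
}

// Hypothetical wrapper showing where such a last-resort handler would sit:
// around whatever the real entry point is.
int runGuarded(int (*realMain)())
{
    try {
        return realMain();
    } catch (const std::exception &e) {
        reportFatal(e);
        return 1;
    }
}
```

With something like this, a crash in a daemonised worker would at least leave a line in the system log even when lsof shows FD 2 on /dev/null. Whether syslog, cache.log, or something else is the right second channel is a design choice for the Squid developers; the sketch only shows the shape of the idea.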