Re: recovery is stuck when children are not processing SIGQUIT from previous crash

On Wed, 2009-09-23 at 10:04 -0400, Tom Lane wrote:
> I'd prefer not to go there, at least not without a demonstration that
> this will solve a bug that's unsolvable otherwise.  If a child is
> really stuck in a state that doesn't accept SIGQUIT, it probably
> won't accept SIGKILL either (e.g., uninterruptible disk wait).  Or maybe
> we just have some errant code that is blocking SIGQUIT; but that's
> a garden variety bug IMO, not something that needs major new postmaster
> logic to work around.

Running strace on the backend processes showed all of them waiting at

futex(0x7f1ee5e21c90, FUTEX_WAIT_PRIVATE, 2, NULL

Notably, the first argument was the same for all of them.
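
For anyone who wants to repeat that comparison across many backends, here is
a minimal sketch in Python that groups PIDs by the futex address in their
last strace line.  The PIDs, the select() line, and the helper name
group_futex_waiters are made up for illustration; only the futex line is
taken from the output above.

```python
import re
from collections import defaultdict

# Matches the start of a futex call as printed by strace, e.g.
#   futex(0x7f1ee5e21c90, FUTEX_WAIT_PRIVATE, 2, NULL
FUTEX_RE = re.compile(r"futex\((0x[0-9a-fA-F]+),\s*(FUTEX_\w+)")


def group_futex_waiters(last_lines):
    """Map futex address -> sorted PIDs whose last strace line waits on it.

    last_lines: dict of PID -> last line of strace output for that PID.
    """
    waiters = defaultdict(list)
    for pid, line in last_lines.items():
        m = FUTEX_RE.search(line)
        # Only count wait operations (FUTEX_WAIT, FUTEX_WAIT_PRIVATE, ...).
        if m and m.group(2).startswith("FUTEX_WAIT"):
            waiters[m.group(1)].append(pid)
    return {addr: sorted(pids) for addr, pids in waiters.items()}


# Hypothetical PIDs; the futex line is the one quoted above.
sample = {
    4101: "futex(0x7f1ee5e21c90, FUTEX_WAIT_PRIVATE, 2, NULL",
    4102: "futex(0x7f1ee5e21c90, FUTEX_WAIT_PRIVATE, 2, NULL",
    4103: "select(0, NULL, NULL, NULL, {1, 0}",
}
result = group_futex_waiters(sample)
print(result)
```

One caveat when reading the result: the _PRIVATE suffix marks the futex as
process-private, so an identical address in several backends may simply be
the same variable at the same virtual address in each process rather than a
single lock shared between them.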

I gather that a futex is a Linux kernel thing, which is probably then
used by glibc to implement some pthreads stuff.  Anyone know more?

But yes, using SIGKILL on these processes works without problem.


-- 
Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
