Re: does wal archiving block the current client connection?

On Fri, 19 May 2006, Tom Lane wrote:

> Well, there's our smoking gun.  IIRC, all the failures you showed us are
> consistent with race conditions caused by multiple archiver processes
> all trying to do the same tasks concurrently.
>
> Do you frequently stop and restart the postmaster?  Because I don't see
> how you could get into this state without having done so.
>
> I've just been looking at the code, and the archiver does commit
> hara-kiri when it notices its parent postmaster is dead; but it only
> checks that in the outer loop.  Given sufficiently long delays in the
> archive_command, that could be a long time after the postmaster died;
> and in the meantime, successive executions of the archive_command could
> be conflicting with those launched by a later archiver incarnation.

Hurray! Unfortunately, the postmaster on the original troubled server almost never gets restarted, and in fact has only one archiver process running right now. Drat!

I guess I'll have to try to catch it in the act the next time the NAS gets wedged (one of the Windows folks caught it last time), so I can debug a little more and gather some useful data.

Let me know if you want me to test a patch, since I've already got this test case set up.
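
For anyone following the thread, here is a rough sketch of the idea behind the race described in the quoted text: check whether the parent is still alive before every archive_command run, not just once per pass of the outer loop. This is an illustrative Python model only, not the actual archiver code (which is C), and the names archive_pending, run_archive_command, and postmaster_alive are hypothetical.

```python
from typing import Callable, Iterable, List


def archive_pending(
    pending: Iterable[str],
    run_archive_command: Callable[[str], bool],
    postmaster_alive: Callable[[], bool],
) -> List[str]:
    """Archive pending segments, re-checking the parent before each one.

    All three parameters are hypothetical stand-ins used for illustration.
    """
    archived: List[str] = []
    for segment in pending:
        # Check before every command, so a slow archive_command cannot
        # keep this loop running long after the parent has died.
        if not postmaster_alive():
            break
        if not run_archive_command(segment):
            # Leave the failed segment (and the rest) for a later retry.
            break
        archived.append(segment)
    return archived
```

With a check like this, an orphaned archiver would stop after at most one more archive_command instead of working through the whole backlog alongside a newer archiver. On Unix, one plausible liveness test is comparing os.getppid() against the parent PID recorded at startup, though that is an assumption here as well.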

--
Jeff Frost, Owner 	<jeff@xxxxxxxxxxxxxxxxxxxxxx>
Frost Consulting, LLC 	http://www.frostconsultingllc.com/
Phone: 650-780-7908	FAX: 650-649-1954

