On 11/10/11 12:48, John R Pierce wrote:
> On 10/10/11 7:44 PM, Craig Ringer wrote:
>> If blocking writes causes a server failure that persists once writes
>> have been unblocked, that's a bug IMO. You might have a bit of a backlog
>> of writes to clear, but after that all should be well, and if it isn't
>> then something needs fixing.
>
> the process is blocked waiting for this disk write to complete,
> meanwhile, the packets are queuing up and waiting for service.
>
> best of luck with all that....

xfs_freeze for long enough to take a snapshot doesn't take long, or it
shouldn't, anyway. Even if it did, that shouldn't cause a server failure
that persists past when disk I/O is resumed, though it might cause
individual connections to drop.

I can `kill -STOP' Pg, or unplug my network cable for several seconds,
and expect everything to resume just fine when I `kill -CONT' or plug
back in. Packets will be buffered by the OS if Pg is busy, or by the
closest router if the network is unplugged, and will be delivered when
it becomes responsive again. If that takes too long or if too many
packets arrive, packets will be dropped, in which case TCP/IP will
re-send them. If the outage is protracted enough the client might
eventually decide the peer has gone away and drop the connection, but
even then new connections should be established to the server just fine
once it resumes responding.

It is totally unreasonable for Pg to *stay* nonfunctional once disk I/O
resumes. Existing connections should receive responses they're waiting
on or die, depending on how long it's been, and new connections should
be accepted fine.

--
Craig Ringer
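
For anyone who wants to try the `kill -STOP' / `kill -CONT' point above
without touching a real Pg instance, here is a minimal sketch in Python.
It is only an illustration under stated assumptions: the echo server is a
hypothetical stand-in for the database backend, the function names are
made up for this example, and it needs a POSIX system (os.fork, SIGSTOP).
It stops the server process, sends a request while it is stopped, and
then checks that the reply arrives once the process is continued.

```python
import os
import signal
import socket


def echo_server(listener):
    """Child process: accept one connection and echo back what arrives."""
    conn, _ = listener.accept()
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:
                break
            conn.sendall(data)


def demo_stop_and_continue(pause_seconds=1.0):
    """Return (reply_while_stopped, reply_after_continue).

    The stand-in server is frozen with SIGSTOP, a request is sent while
    it is frozen, and the server is then resumed with SIGCONT.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    pid = os.fork()
    if pid == 0:
        # Child: serve one client, then exit without running parent cleanup.
        try:
            echo_server(listener)
        finally:
            os._exit(0)

    listener.close()  # the child keeps its own copy of the listening socket
    client = socket.create_connection(("127.0.0.1", port))
    try:
        os.kill(pid, signal.SIGSTOP)
        os.waitpid(pid, os.WUNTRACED)  # wait until the child is really stopped

        client.sendall(b"ping")  # the kernel buffers this for the child
        client.settimeout(pause_seconds)
        try:
            early = client.recv(1024)
        except socket.timeout:
            early = None  # no reply while the server is stopped

        os.kill(pid, signal.SIGCONT)
        client.settimeout(5.0)
        late = client.recv(1024)  # the buffered request is now serviced
    finally:
        os.kill(pid, signal.SIGCONT)  # harmless if already running
        client.close()
        os.waitpid(pid, 0)  # reap the child
    return early, late


if __name__ == "__main__":
    print(demo_stop_and_continue())
```

On a Linux box I would expect no reply during the pause and the echoed
request after the resume, which is the same "backlog clears and all is
well" behaviour described above. It says nothing about how Pg itself
behaves under a frozen filesystem; it only shows the OS-level buffering.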