On Tue, Oct 11, 2011 at 12:04 AM, Craig Ringer <ringerc@xxxxxxxxxxxxx> wrote:
> On 11/10/11 12:48, John R Pierce wrote:
>> On 10/10/11 7:44 PM, Craig Ringer wrote:
>>> If blocking writes causes a server failure that persists once writes
>>> have been unblocked, that's a bug IMO. You might have a bit of a backlog
>>> of writes to clear, but after that all should be well, and if it isn't
>>> then something needs fixing.
>>
>> the process is blocked waiting for this disk write to complete,
>> meanwhile, the packets are queuing up and waiting for service.
>>
>> best of luck with all that....
>
> xfs_freeze for long enough to take a snapshot doesn't take long, or it
> shouldn't, anyway.

On average, xfs_freeze takes about 2 seconds for us with 8 EBS volumes at 60GB each in a software RAID-0 array.

> Even if it did, that shouldn't cause a server failure
> that persists past when disk I/O is resumed, though it might cause
> individual connections to drop.

<DELETED>

> It is totally unreasonable for Pg to *stay* nonfunctional once disk I/O
> resumes. Existing connections should receive responses they're waiting
> on or die, depending on how long it's been, and new connections should
> be accepted fine.

Exactly. I genuinely expect Postgres to be able to withstand a couple of seconds of blocked disk I/O. Especially since this isn't a heavy duty transaction processing system - it's under load, but not a tremendously high load. During our busier times we average something in the neighborhood of 300-400 transactions per second, which just doesn't seem like that much.

As much as I would like Postgres to withstand a 2 second outage, I don't honestly care. I'd just like to figure out whether I'm looking at something that's actually a problem or if I should be looking elsewhere for the problem.

--
Sean Laurent
Director of Operations
StudyBlue, Inc.
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general