On Thu, 9 Apr 2009, Bryan Murphy wrote:
> (1) hot spare applies 70 to 75 wal files (~1.1g) in 2 to 3 min period
Yeah, if you ever let this many files queue up you're facing a long recovery time. You really need to get into a position where you're applying WAL files regularly enough that you don't ever fall this far behind.
> (2) hot spare pauses for 15 to 20 minutes, during this period pdflush consumes 99% IO (iotop). Dirty (from /proc/meminfo) spikes to ~760mb, remains at that level for the first 10 minutes, and then slowly ticks down to 0 for the second 10 minutes.
What does vmstat say about the bi/bo during this time period? It sounds like the volume of random I/O produced by recovery is just backing up as expected. Some quick math:
15GB RAM * 5% dirty_ratio = 750MB ; that's where your measured 760MB bottleneck is coming from.
750MB / 10 minutes = 1.25MB/s ; that's in the normal range for random writes with a single disk
Therefore my bet is that "vmstat 1" will show bo~=1250 the whole time you're waiting there, with matching figures from iostat for the database disk during that period.
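The quick math above can be reproduced with a short Python sketch. The function name and its parameters are made up for illustration, and it assumes the same round numbers used in the estimate (1GB = 1000MB, and one vmstat "bo" block taken as roughly 1KB), so treat the output as a ballpark figure rather than a measurement:

```python
def dirty_backlog_estimate(ram_mb, dirty_pct, drain_minutes):
    """Rough estimate of the dirty-page ceiling and the rate it drains at.

    Hypothetical helper, not part of any tool. Assumes the dirty ceiling
    is simply a percentage of total RAM and that it drains at a constant
    rate over drain_minutes.
    """
    # Amount of dirty memory the kernel allows before writers get throttled
    dirty_limit_mb = ram_mb * dirty_pct / 100.0
    # Average write rate needed to drain that much in the observed time
    drain_rate_mb_s = dirty_limit_mb / (drain_minutes * 60.0)
    # Assumption: vmstat "bo" counts ~1KB blocks per second, 1MB = 1000KB
    expected_bo = drain_rate_mb_s * 1000
    return dirty_limit_mb, drain_rate_mb_s, expected_bo


# Figures from this thread: 15GB RAM, 5% ratio, ~10 minutes to drain
limit_mb, rate_mb_s, bo = dirty_backlog_estimate(15000, 5, 10)
print("dirty limit: %.0f MB" % limit_mb)
print("drain rate:  %.2f MB/s" % rate_mb_s)
print("vmstat bo:   ~%.0f" % bo)
```

Plugging in a different drain time or ratio shows how quickly the pause shrinks if the disks can sustain a higher random write rate.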
Basically your options here are:

1) Decrease the maximum possible segment backlog so you can never get this far behind
2) Increase the rate at which random I/O can be flushed to disk by either
   a) Improving things with a [better] battery-backed controller disk cache
   b) Striping across more disks

--
* Greg Smith gsmith@xxxxxxxxxxxxx http://www.gregsmith.com Baltimore, MD