Thanks for these suggestions. I am beginning to wonder if the issue is
deeper. I set autovacuum to off, then turned off all the connections to
the database and did a manual vacuum just to see how long it takes. That
was last night my time; when I woke up this morning it still had not
finished. The maintenance_work_mem given to the DB for this process was
2GB. There is nothing else going on on the server! Now even REINDEX is
failing in the middle:

# REINDEX INDEX new_idx_userid;
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

What else could be wrong?

On Mon, Apr 18, 2011 at 2:38 AM, Scott Marlowe <scott.marlowe@xxxxxxxxx> wrote:
> On Sun, Apr 17, 2011 at 10:59 AM, Phoenix <phoenix.kiula@xxxxxxxxx> wrote:
>> TOP does not show much beyond "postmaster". How should I use TOP and
>> what info can I give you? This is what it looks like:
>
> We're basically looking to see if the postmaster process doing the
> vacuuming or reindexing is stuck in a D state, which means it's
> waiting on IO. Hit the c key while it's running and you should get a
> little more info on which processes are what.
>
>>  4799 postgres  15   0  532m  94m  93m D  0.7  1.2  0:00.14 postmaster
>
> That is likely the postmaster that is waiting on IO.
>
>> VMSTAT 10 shows this:
>>
>>  r  b   swpd   free   buff    cache  si so   bi  bo   in     cs us sy id wa
>>  3 14  99552  17900  41108  7201712   0  0   42  11    0      0  8 34 41 16
>>  2 17  99552  16468  41628  7203012   0  0 1326  84 1437 154810  7 66 12 15
>>  3  7  99476  16796  41056  7198976   0  0 1398  96 1453 156211  7 66 21  6
>>  3 17  99476  17228  39132  7177240   0  0 1325  68 1529 156111  8 65 16 11
>
> So we're at 11 to 15% IO wait. I'm gonna guess you have 8 cores /
> threads in your CPUs, and 1/8th of 100% is 12%, so it looks like
> you're probably IO bound here. iostat tells us more:
>
>> The results of "iostat -xd 10" are:
>>
>> Device:  rrqm/s wrqm/s    r/s  w/s  rsec/s wsec/s   rkB/s  wkB/s avgrq-sz avgqu-sz await svctm  %util
>> sda        0.00   7.41   0.30 3.50    2.40  87.29    1.20  43.64    23.58     0.13 32.92 10.03   3.81
>> sdb        0.00   0.00   0.00 0.00    0.00   0.00    0.00   0.00     0.00     0.00  0.00  0.00   0.00
>> sdc        0.00  18.32 158.26 4.10 2519.32 180.98 1259.66  90.49    16.63    13.04 79.91  6.17 100.11
>
> 100% IO utilization, so yeah, it's likely that your sdc drive is your
> bottleneck. Given how little data is actually moving through the sdc
> drive, it's not very fast.
>
>> 8GB memory in total. 1GB devoted to PGSQL during these operations.
>> Otherwise, my settings are as follows (and yes I did make the vacuum
>> settings more aggressive based on your email, which has had no
>> apparent impact) --
>
> Yeah, as it gets more aggressive it can use more of your IO bandwidth.
>
>> What else can I share?
>
> That's a lot of help. I'm assuming you're running software or
> motherboard fake-raid on this RAID-1 set? I'd suggest buying a $500
> or so battery-backed caching RAID controller first; the improvements
> in performance are huge with such a card. You might wanna try testing
> the current RAID-1 set with bonnie++ to get an idea of how fast it is.
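
A few quick checks that might narrow this down. The "server closed the
connection unexpectedly" during REINDEX means a backend process died,
and with maintenance_work_mem at 2GB on an 8GB box the Linux OOM killer
is a plausible suspect. A rough way to check, assuming a Linux server
with standard kernel logging (the PostgreSQL log path below is a
placeholder; it varies by distro and install):

    $ dmesg | grep -iE 'out of memory|oom|killed process'
    $ grep -i 'terminated by signal' /path/to/postgresql.log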
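
To tell whether the overnight VACUUM is actually advancing rather than
stuck, VACUUM VERBOSE reports progress table by table and index by
index, and pg_stat_activity shows when the running statement started.
A minimal sketch, assuming a 9.0-era server ("mytable" is a
placeholder; on 9.2 and later the columns are pid and query instead of
procpid and current_query):

    # VACUUM VERBOSE mytable;
    # SELECT procpid, query_start, current_query FROM pg_stat_activity;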
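
To catch processes sitting in the D (uninterruptible IO wait) state
Scott describes, without watching top interactively, something like
this works on a typical Linux box with procps:

    $ ps -eo state,pid,wchan:30,cmd | grep '^D'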
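
And for the bonnie++ test Scott suggests, a minimal run against the
RAID-1 set might look like the following; the user and test directory
are placeholders, and the test size (in MB here) should be roughly
twice RAM, about 16GB on this box, so the OS cache can't hide the
disks. The -n 0 skips the small-file creation tests:

    $ bonnie++ -u postgres -d /mnt/raid_test -s 16384 -n 0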