Thanks Scott. I have shared a lot of detail in my emails to you and Merlin, intentionally kept off-list. Apologies if you are feeling tired of this thread.

The reason I need to REINDEX is that a simple SELECT query on the indexed column is taking ages. It used to take less than a second. At the very least I want to make sure the index is properly in place. We went through some bloat reports, and Merlin concluded there is no significant bloat. A manual VACUUM right now takes ages too. My autovacuum settings are below. The box is a RAID 1 setup with two 10,000rpm Raptor disks.

top does not show much beyond "postmaster" processes. How should I use top, and what information from it can I give you? This is what it looks like:

  PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
14231 root     18  0  4028  872  728 R 93.8  0.0 28915:37 exim_dbmbuild
11001 root     25  0  4056  864  716 R 93.8  0.0 23111:06 exim_dbmbuild
16400 root     25  0  4824  864  720 R 92.5  0.0 33843:52 exim_dbmbuild
 4799 postgres 15  0  532m  94m  93m D  0.7  1.2  0:00.14 postmaster
12292 nobody   15  0 48020  14m 5088 S  0.7  0.2  0:00.06 httpd
12943 root     17  0  2828 1224  776 R  0.7  0.0  0:00.04 top
 7236 mysql    16  0  224m  64m 3692 S  0.3  0.8 26:43.46 mysqld
31421 postgres 15  0  530m  12m  12m S  0.3  0.2  0:03.08 postmaster
31430 postgres 15  0 10456  576  224 S  0.3  0.0  0:00.08 postmaster
  955 postgres 15  0  532m  91m  90m S  0.3  1.1  0:00.15 postmaster
 1054 postgres 15  0  532m 196m 195m S  0.3  2.4  0:00.37 postmaster
 1232 postgres 15  0  532m  99m  98m D  0.3  1.2  0:00.27 postmaster
 1459 postgres 15  0  532m  86m  85m S  0.3  1.1  0:00.12 postmaster
 4552 postgres 15  0  532m  86m  85m S  0.3  1.1  0:00.08 postmaster
 7187 postgres 15  0  532m 157m 155m S  0.3  1.9  0:00.19 postmaster
 7587 postgres 15  0  532m 175m 173m D  0.3  2.2  0:00.23 postmaster
 8131 postgres 15  0  532m 154m 152m S  0.3  1.9  0:00.15 postmaster
 9473 nobody   16  0 48268  15m 5800 S  0.3  0.2  0:00.34 httpd
 9474 nobody   15  0 48096  14m 5472 S  0.3  0.2  0:00.27 httpd
10688 nobody   16  0     0    0    0 Z  0.3  0.0  0:00.20 httpd <defunct>
12261 nobody   15  0 47956  13m 4296 S  0.3  0.2  0:00.08 httpd
12278 nobody   15  0 47956  13m 4052 S  0.3  0.2  0:00.04 httpd
12291 nobody   15  0 47972  14m 4956 S  0.3  0.2  0:00.07 httpd
12673 nobody   15  0 47912  13m 4180 S  0.3  0.2  0:00.02 httpd
12674 nobody   15  0 47936  13m 4924 S  0.3  0.2  0:00.02 httpd
12678 nobody   16  0 47912  13m 4060 S  0.3  0.2  0:00.01 httpd
12727 nobody   15  0 47912  13m 4024 S  0.3  0.2  0:00.03 httpd
12735 nobody   15  0 47912  13m 4144 S  0.3  0.2  0:00.02 httpd

"vmstat 10" shows this:

 r  b   swpd  free  buff   cache  si so   bi bo   in     cs us sy id wa
 3 14  99552 17900 41108 7201712   0  0   42 11    0      0  8 34 41 16
 2 17  99552 16468 41628 7203012   0  0 1326 84 1437 154810  7 66 12 15
 3  7  99476 16796 41056 7198976   0  0 1398 96 1453 156211  7 66 21  6
 3 17  99476 17228 39132 7177240   0  0 1325 68 1529 156111  8 65 16 11

The results of "iostat -xd 10" are:

Device: rrqm/s wrqm/s    r/s   w/s  rsec/s wsec/s   rkB/s  wkB/s avgrq-sz avgqu-sz await svctm  %util
sda       0.24  24.55   9.33  4.41  111.31 231.75   55.65 115.88    24.97     0.17 12.09  6.67   9.17
sdb       0.06  97.65   2.21  3.97   91.59 389.58   45.80 194.79    77.84     0.06  9.95  2.73   1.69
sdc       1.46  62.71 187.20 29.13  132.43 311.72   66.22 155.86     2.05     0.36  1.65  1.12  24.33

Device: rrqm/s wrqm/s    r/s   w/s  rsec/s wsec/s   rkB/s  wkB/s avgrq-sz avgqu-sz await svctm  %util
sda       0.00   7.41   0.30  3.50    2.40  87.29    1.20  43.64    23.58     0.13 32.92 10.03   3.81
sdb       0.00   0.00   0.00  0.00    0.00   0.00    0.00   0.00     0.00     0.00  0.00  0.00   0.00
sdc       0.00  18.32 158.26  4.10 2519.32 180.98 1259.66  90.49    16.63    13.04 79.91  6.17 100.11

Device: rrqm/s wrqm/s    r/s   w/s  rsec/s wsec/s   rkB/s  wkB/s avgrq-sz avgqu-sz await svctm  %util
sda       0.00   6.21   0.00  1.40    0.00  60.86    0.00  30.43    43.43     0.03 20.07 15.00   2.10
sdb       0.00   0.00   0.00  0.00    0.00   0.00    0.00   0.00     0.00     0.00  0.00  0.00   0.00
sdc       0.10  10.31 159.06  2.50 2635.44 101.70 1317.72  50.85    16.94    12.82 79.44  6.20 100.12

There is 8GB of memory in total; 1GB is devoted to PostgreSQL during these operations.
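Since several of the postmaster backends above are in state "D" (uninterruptible sleep, which usually means waiting on disk I/O), one concretely reportable thing from top is which backends are stuck in that state. A minimal sketch, assuming top output has been saved in batch mode (`top -b -n 1 > /tmp/top_sample.txt`); the sample lines are copied from the output above, and the `/tmp` path is just an illustration:

```shell
# Illustrative sample of "top -b" process lines; in practice this file
# would come from: top -b -n 1 > /tmp/top_sample.txt
cat > /tmp/top_sample.txt <<'EOF'
 4799 postgres  15   0  532m  94m  93m D  0.7  1.2   0:00.14 postmaster
 7236 mysql     16   0  224m  64m 3692 S  0.3  0.8  26:43.46 mysqld
31421 postgres  15   0  530m  12m  12m S  0.3  0.2   0:03.08 postmaster
 1232 postgres  15   0  532m  99m  98m D  0.3  1.2   0:00.27 postmaster
EOF

# Field 2 is the user and field 8 is the process state; print the PIDs
# of postgres backends in uninterruptible sleep ("D").
awk '$2 == "postgres" && $8 == "D" {print $1}' /tmp/top_sample.txt
# prints: 4799 and 1232
```

Backends piling up in "D", together with sdc sitting at ~100% util in the iostat output, would point at disk saturation rather than a broken index.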
Otherwise, my settings are as follows (and yes, I did make the vacuum settings more aggressive based on your email, which has had no apparent impact):

max_connections        = 350
shared_buffers         = 500MB
effective_cache_size   = 1250MB
max_fsm_relations      = 1500
max_fsm_pages          = 950000
work_mem               = 100MB
maintenance_work_mem   = 200MB
temp_buffers           = 4096
authentication_timeout = 10s
ssl                    = off
checkpoint_warning     = 3600
random_page_cost       = 1

What else can I share? Thanks much for offering to help.

On Sun, Apr 17, 2011 at 11:44 PM, Scott Marlowe <scott.marlowe@xxxxxxxxx> wrote:
> On Sun, Apr 17, 2011 at 9:30 AM, Phoenix Kiula <phoenix.kiula@xxxxxxxxx> wrote:
>> Sorry, rejuvenating a thread that was basically unanswered.
>>
>> I closed the database to all other access so I could focus on
>> maintenance, and killed all earlier processes so that my maintenance
>> is the only thing running.
>>
>> REINDEX has now been running for 3 hours -- and it is still not finished!
>>
>> Similarly, if I cancel the REINDEX and issue a VACUUM ANALYZE VERBOSE,
>> this too seems to just hang on my big table.
>>
>> I changed maintenance_work_mem to 2GB for this operation. It's
>> highly worrisome -- the above slow times are with 2GB of my server
>> dedicated to PostgreSQL!
>>
>> Surely this is not tenable for enterprise environments? I am on a
>> 64-bit RedHat server with dual Intel Woodcrest CPUs. Postgres is 8.2.9.
>>
>> How do DB folks do this with small maintenance windows? This is for a
>> very high-traffic website, so it's beginning to get embarrassing.
>>
>> Would appreciate any thoughts or pointers.
>
> Upgrade to something more modern than 8.2.x. Autovacuum was still
> very much in its infancy back then. 9.0 or higher is a good choice.
> What do "iostat -xd 10", "vmstat 10", and top say about these
> processes when they're running? "It's taking a really long time and
> seems like it's hanging" tells us nothing useful.
> Your OS has tools to let you figure out what's bottlenecking your
> operations, so get familiar with them and let us know what they tell
> you. These are all suggestions I made before, which you have now
> classified as "not answering your questions", so I'm getting a little
> tired of helping you when you don't seem interested in helping yourself.
>
> What are your vacuum and autovacuum costing values set to? Can you
> make vacuum and/or autovacuum more aggressive?

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
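As a postscript on Scott's question about the vacuum and autovacuum costing values: rather than hunting through postgresql.conf, the values actually in effect can be read from the pg_settings view. A sketch of the kind of query that would answer it (not run here; pattern-matching on setting names, which may pick up a few unrelated rows):

```sql
-- Vacuum and autovacuum cost-based delay settings currently in effect,
-- with where each value came from (default, config file, etc.).
SELECT name, setting, source
FROM pg_settings
WHERE name LIKE 'vacuum_cost%'
   OR name LIKE 'autovacuum%'
ORDER BY name;
```

Posting that output, plus any rows where source is not 'default', would show exactly how aggressive the cost limits currently are.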