Hi Richard, thanks for your immediate response. I will answer your
questions below:

On Tuesday, 8 April 2008 17:40, Richard Huxton wrote:
> mljv@xxxxxxxxxxxx wrote:
> > We looked at our CPU monitoring and saw that we have huge IOwait
> > while VACUUM is running, not unusual though. But just after VACUUM
> > was finished, the userCPU load rose to 200% (dual CPU). At this time
> > in the early morning there were only very few users using our app.
> > We analyzed the situation further and saw that most queries had a
> > very long duration (minutes instead of milliseconds). BTW: the
> > vacuum process was not running at that moment; it was logged before
> > as done.
>
> But WHAT was using all of your cpu? Was it PostgreSQL, and if so just
> one backend? If something else was using all your cpu, then it might
> just be that the PostgreSQL server wasn't getting a chance to run your
> queries.

The CPU was used by PostgreSQL. As I saw many SELECTs when looking at
"ps ax", I guess that many backends were using all of the CPU.

> > * Maybe the VACUUM ANALYZE process has not enough memory and
> > therefore could not ANALYZE the tables correctly. It then writes
> > wrong statistics to the database, which results in wrong execution
> > plans using sequential scans instead of index scans. This only
> > happens if VACUUM ANALYZE runs concurrently with user requests. If
> > it runs on its own, the vacuum process has enough memory and writes
> > correct statistics to the database.
>
> Doesn't sound likely to me. You've got 8GB of RAM, and if you were
> going into swap you'd have noticed the disk I/O.

That makes sense. We did not have any IOwait at that moment, but 200%
userCPU. But maybe vacuum can't use the RAM because of a
misconfiguration.

> > Here are some of our configuration parameters. We never really
> > tweaked them as everything ran fine. We just raised some parameters.
> > The following list should show all parameters changed from the
> > default:
> >
> > max_connections = 300
> > shared_buffers = 30000
> > work_mem = 10240
>
> OK, so that's 30,000 * 8KB = 240MB of shared_buffers.
> You have 10MB of work_mem and if all 300 connections were using that
> much you'd have committed 3GB of your RAM for that. Of course they'll
> want more than just that.
>
> Do you really have 300 concurrent connections?

At peak time, yes.

> > maintenance_work_mem = 163840
>
> 160MB for vacuums - should be OK given how much memory you have and
> the fact that it's quiet when you vacuum.
>
> > vacuum_mem = 32000
>
> This is what maintenance_work_mem used to be called. You can delete
> this entry.
>
> > max_fsm_pages = 500000
>
> You can track at most 500,000 pages with free space on them. In 8.2+
> versions VACUUM VERBOSE will show you how many are currently being
> used. Not sure about 8.1.
>
> > bgwriter_lru_percent = 10.0
> > bgwriter_lru_maxpages = 100
> > bgwriter_all_percent = 5
> > bgwriter_all_maxpages = 200
> > wal_buffers = 16
> >
> > checkpoint_segments = 10
>
> If you have bursts of write activity you might want to increase this.
>
> > checkpoint_warning = 3600
> >
> > effective_cache_size = 180000
>
> That's 180,000 * 8KB = 180 * 8MB = 1.4GB.
> If that's really all you're using as cache, I'd reduce the number of
> concurrent connections. Check free/top and see how much RAM is really
> being used as disk cache.
>
> > random_page_cost = 3
>
> Might be too high - you don't mention what disks you have.

A 3ware RAID-1 controller with plain (cheap) SATA disks.
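Coming back to my theory about wrong statistics: next time the slowdown
happens I will compare the plan and the stored statistics before and
after the nightly VACUUM ANALYZE, roughly like this (just a sketch; the
table and column names are placeholders, not our real schema):

  -- does the plan flip from an index scan to a sequential scan
  -- once the concurrent ANALYZE has run?
  EXPLAIN ANALYZE
  SELECT * FROM some_big_table WHERE some_indexed_col = 42;

  -- and what ANALYZE actually stored for that column
  SELECT attname, null_frac, n_distinct, correlation
  FROM pg_stats
  WHERE tablename = 'some_big_table'
    AND attname = 'some_indexed_col';

If the n_distinct or correlation values look very different after the
concurrent run, that would support the theory; if they look the same,
the problem must be elsewhere.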
> The crucial thing is to find out exactly what is happening when things
> get very slow. Check vmstat and top, look in the pg_locks system table
> and if needs be we can see what strace says a particular backend is
> doing.

The problem is that at peak times everything is running fine. It only
goes wrong when the vacuum process has been running. To simplify my
problem:

- If I run VACUUM ANALYZE concurrently with a few user queries,
  everything slows down to a crawl after the vacuum is finished.
- If I run it while no user is connected, everything runs fine
  afterwards.

So it has something to do with the VACUUM ANALYZE process.

kind regards,
janning
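PS: Next time it gets slow I will also capture what the backends are
doing, along the lines of your suggestion. Roughly like this (just a
sketch; if I remember correctly the pid column is called procpid on
8.1, and stats_command_string must be on to see the query text):

  -- which backends are busy, since when, and which locks do they hold?
  SELECT a.procpid, a.usename, a.query_start,
         l.mode, l.granted, a.current_query
  FROM pg_stat_activity a
  LEFT JOIN pg_locks l ON l.pid = a.procpid
  WHERE a.current_query <> '<IDLE>'
  ORDER BY a.query_start;

Together with vmstat/top output that should show whether the backends
are actually waiting on locks or really burning CPU in bad plans.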