"Eric B. Ridge" <ebr@xxxxxxxx> wrote:
> Hi! We've got a Postgres 8.1.5 installation with a 60GB-ish database:
>
> =# select version();
>                               version
> ------------------------------------------------------------------------
>  PostgreSQL 8.1.5 on i686-pc-linux-gnu, compiled by GCC gcc (GCC)
>  4.1.1 (Gentoo 4.1.1)
> (1 row)
>
> =# select pg_size_pretty(pg_database_size('xxx'));
>  pg_size_pretty
> ----------------
>  59 GB
>
> (Please don't freak out over the word Gentoo! This is Postgres compiled,
> by us, using the sources from postgresql.org, not some patched-up
> Gentoo version.)
>
> Anyway, we run a VACUUM FULL ANALYZE VERBOSE every night, and it's
> now taking 3+ hours to complete. We also have autovacuum turned on
> with default settings. This 3+ hour vacuum time is cutting into our
> production hours. :(

Have you established any reason to run VACUUM FULL at all? Generally
speaking, VACUUM FULL isn't necessary, and overall isn't a good idea.

If you do have a good reason for running it (which I'd be interested to
hear), you could reduce the impact by breaking the job up. Perhaps
vacuum one table each night: if you have 30 tables, each table gets a
VACUUM FULL once a month, and each night's job should only take about
5 minutes.

Note that if you _do_ run VACUUM FULL, it's a good idea to REINDEX
afterward. VACUUM FULL tends to cause a lot of index bloat (which, BTW,
is another good reason not to use it).

> Note that while we're vacuuming, Postgres is running in single-user
> mode and not listening. There are no other connections to the database.
>
> The hardware itself is a dual-core Intel 2GHz with 4GB RAM and adequate
> space in a hardware RAID 5 configuration. I realize RAID 5 isn't
> ideal, but in the general use cases of our database it doesn't
> noticeably impact performance.

Grab some systems data while the process is running: a few snapshots of
top output, iostat, plus anything else you can think of to narrow down
where the bottleneck is.
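The one-table-per-night rotation could be cron-scripted along these lines. This is only a sketch: the database name and table list are placeholders you'd replace with your own, and it prints the commands rather than running them.

```shell
#!/bin/sh
# Sketch: VACUUM FULL one table per night, cycling through the list,
# then REINDEX it to undo the index bloat VACUUM FULL tends to cause.
# DB name and table list are hypothetical -- adjust for your schema.
DB=xxx
TABLES="orders customers invoices line_items"

set -- $TABLES
NTABLES=$#

# Pick tonight's table from the day of the month (strip the leading
# zero so "08"/"09" aren't parsed as octal by shell arithmetic).
DAY=$(date +%d)
DAY=${DAY#0}
IDX=$(( (DAY - 1) % NTABLES + 1 ))
eval "TABLE=\${$IDX}"

# Dry run: print the commands; drop the 'echo' to actually execute them.
echo psql -d "$DB" -c "VACUUM FULL ANALYZE VERBOSE $TABLE;"
echo psql -d "$DB" -c "REINDEX TABLE $TABLE;"
```

With more tables than days in the month you'd want a persistent counter instead of the calendar, but the day-of-month trick keeps the script stateless.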
However, with a ~60GB database and 4GB of RAM to work with, I would
expect that I/O is going to be the bottleneck, and that comes back to
using something faster than RAID 5.

> How can we begin to cut down the vacuum time? My first thought is
> simply to change the schedule to run weekly (or biweekly), since we're
> also running autovacuum. Are there any other Postgres configuration
> changes that might help to improve vacuum performance?
>
> The settings we've explicitly set are:
>
> max_connections = 256
> shared_buffers = 40000

I use ~250000 shared_buffers on our large-memory systems. I don't have
conclusive evidence that going that high is really worth it, but I
haven't had any performance problems with it. Note that with a DB as
large as yours, this may cause problems with the memory available for
work_mem, but you'll have to test your use case to know for sure.

> temp_buffers = 5000
> work_mem = 32768
> maintenance_work_mem = 65535
> max_fsm_pages = 120000
> fsync = on
> wal_buffers = 16
> effective_cache_size = 5000
> log_connections = on
> log_duration = off
> log_line_prefix = '%m [xid=%x] [%p]: '
> stats_start_collector = on
> stats_command_string = on
> stats_block_level = on
> stats_row_level = on
> autovacuum = on

-- 
Bill Moran
http://www.potentialtech.com
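For anyone following along: in 8.1, shared_buffers is a count of 8 kB pages, so the two figures in this thread work out as below (a quick arithmetic check, nothing more):

```shell
# Convert shared_buffers page counts (8 kB pages in PostgreSQL 8.1)
# to megabytes: the poster's 40000 vs. the suggested ~250000.
PAGE_KB=8
for pages in 40000 250000; do
    echo "$pages pages = $(( pages * PAGE_KB / 1024 )) MB"
done
# 40000 pages = 312 MB
# 250000 pages = 1953 MB
```

So the suggested setting is roughly 2 GB, about half the 4 GB of RAM on the box in question.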