Re: CREATE TABLE slowing down significantly over time

Scott Marlowe wrote:
> Also note that the argument that autovacuum chews up too much IO is
> moot now that you can set cost delay to 10 to 20 milliseconds.  Unless
> you're running on the hairy edge of maximum IO at all times, autovac
> should be pretty much unnoticed.

And if you're running on the hairy edge like that, you really need autovacuum whether you think you can afford it or not. Badly maintained tables are also I/O intensive, and it's easy for someone who thinks "I'm too busy to allocate VACUUM time" to end up wasting more resources than it would have taken to just do things right in the first place. I see way too many people who suffer from false economy when it comes to autovacuum planning.
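As a concrete sketch of the cost-delay throttling Scott mentions (the table name is just an example; the per-table syntax requires 8.4 or later):

```sql
-- Global throttle, in postgresql.conf: sleep 20ms each time autovacuum
-- hits its cost limit. Takes effect on reload, no restart needed.
--   autovacuum_vacuum_cost_delay = 20ms

-- Or override the delay for one busy table (8.4+ storage parameter;
-- "mytable" is a placeholder):
ALTER TABLE mytable SET (autovacuum_vacuum_cost_delay = 20);
```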

Some comments on this whole discussion:

1) You don't end up with dead rows [auto]vacuum needs to clean up only when you DELETE things. They show up when you UPDATE things, too--the original row isn't removed until after the new one is written. The overhead isn't as bad on UPDATEs in 8.3 and later (thanks to HOT), but just because you don't delete doesn't mean you don't need VACUUM to clean up dead rows.
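You can watch this happen via the statistics collector; something like the following shows which tables are accumulating dead rows and when autovacuum last touched them:

```sql
-- Dead rows accumulate from UPDATEs as well as DELETEs; the stats
-- collector tracks them per table. Tables at the top of this list
-- are the ones most in need of vacuuming.
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;
```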

2) Any time you find yourself considering VACUUM FULL to clean things up, you're probably making a mistake; the situations where it's the only tool that can recover things tend to be broader disasters. The guidelines at http://developer.postgresql.org/pgdocs/postgres/routine-vacuuming.html spell out my feeling here as a tip: "the best way is to use CLUSTER or one of the table-rewriting variants of ALTER TABLE". If you're fighting performance issues because vacuuming was mismanaged in the past and an unknown amount of garbage has piled up in your tables, it's quite possible you'll find the reinvigorated performance you get from CLUSTER worth the maintenance cost of holding an exclusive lock while it runs. See http://it.toolbox.com/blogs/database-soup/getting-rid-of-vacuum-full-feedback-needed-33959 for more information. I run CLUSTER all the time, and every time I think I'm saving time by doing VACUUM FULL/REINDEX instead, I regret it as another false economy. Turn on autovacuum, make it run all the time but at a slower average speed if you're concerned about its overhead, and use CLUSTER once to blow away the bloat accumulated before you were doing the right things.
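The one-time cleanup looks like this (table and index names are placeholders; the USING form is 8.4 syntax, older versions spell it CLUSTER indexname ON tablename):

```sql
-- Rewrite the table in the physical order of one of its indexes,
-- discarding all dead space. Holds an exclusive lock for the duration,
-- so schedule it in a maintenance window.
CLUSTER mytable USING mytable_pkey;

-- The rewrite invalidates the planner's statistics; refresh them.
ANALYZE mytable;
```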

3) There is a lot of speculation here and no measurements. What I'd do in this case is run something like the query at http://wiki.postgresql.org/wiki/Disk_Usage (but without the lines that filter out pg_catalog, because the catalogs are a strong suspect here) regularly while debugging. Measure how big all the catalog tables and indexes are, do the operations you believe make things better or worse, then measure again. Turn autovacuum on, repeat the test, and see if things are different. This type of problem tends to be easy to quantify.
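A simplified variant of that wiki query, with no filter so the system catalogs show up too, might look like:

```sql
-- Size of every relation, catalogs included--a prime suspect when
-- CREATE TABLE itself is slowing down. Run before and after each
-- test step and compare the numbers.
SELECT n.nspname, c.relname,
       pg_size_pretty(pg_relation_size(c.oid)) AS size
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
ORDER BY pg_relation_size(c.oid) DESC
LIMIT 20;
```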

--
Greg Smith    greg@xxxxxxxxxxxxxxx    Baltimore, MD


--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
