2011/10/7 Vincent de Phily <vincent.dephily@xxxxxxxxxxxxxxxxx>:
> On Thursday 06 October 2011 12:09:30 Andrew Sullivan wrote:
>> On Thu, Oct 06, 2011 at 03:39:02PM +0200, Vincent de Phily wrote:
>> > In the case of vacuuming however, I think there's a point to be made
>> > about finishing fast when all vacuum workers are constantly busy: say
>> > the vacuum daemon notices that there are 10 tables that need vacuuming
>> > now. It allocates 3 workers, but while they do their intentionally-slow
>> > work, the other 7 tables keep creating more vacuumable tuples, so it'll
>> > be more work overall because they're "late" in their "vacuum schedule".
>> > Does that make sense (I'm not sure it does)?
>>
>> Yes, that's exactly the issue. You need to balance the resources you're
>> depriving the "real" database transactions of (i.e. the user ones)
>> against the cost of waiting, which will probably cost those user
>> transactions in performance. The reason there's no magic solution is
>> because much of this depends on your use patterns.
>
> Ok, I'm glad my reasoning wasn't completely flawed :)
>
>> > Anyway, my particular issue is solved for now: I realized those tables
>> > were terribly bloated (often more than 99% slack), so I vacuum-fulled
>> > them and now the autovacuums run very fast and the disk is 90% idle
>> > again. That slack probably appeared at table initialization time
>> > because the fsm was not big enough. I have since raised the fsm, but I
>> > think it's big enough during normal (non-init) usage anyway.
>>
>> This is what you want to keep an eye on, then.
>
> Yup, watching that.
>
>> Why do you think it came from "initialization time", though? VACUUM
>> only has work to do when dead tuples show up (e.g. from DELETE or
>> UPDATE), and normally when you first populate a table you do a COPY,
>> which isn't going to create dead tuples.
>
> Those tables are a hand-made trigger-maintained "materialized view"
> created about 2 months ago. Initializing them meant doing a full seqscan
> of the reference table and doing one insert and 1-2 updates for each row
> in the MV table. The work was split into thousands of transactions with
> a load-dependent sleep between them, in order not to impact user
> queries. Those updates (and some inserts) still happen during normal
> usage, but at a much slower pace which autovacuum should have no trouble
> keeping up with.
>
>> > I'm still interested in more opinions about my two questions:
>> > * When does it make sense to make autovacuum more aggressive on IO,
>> >   and by how much?
>>
>> At bottom, you don't want your tables to get so bloated that they
>> exhibit the problem you just saw, but you also don't want vacuum to be
>> taking so much I/O that your other tasks can't get done. That's the
>> general principle; how it applies to your case depends rather on use
>> patterns. For instance, if you know that there will be at most 10%
>> churn on every table every day, but all transactions happen between
>> 9:00 and 17:00 local time, then it's probably safe to allow that to
>> happen: as long as your FSM can keep track, it can all be recovered
>> every day after 17:00, so you might as well allow the work to build up,
>> and let the vacuums happen when they're not stealing any I/O from user
>> queries. If, on the other hand, you get 100% churn on 50% of the tables
>> every day between 09:00 and 11:00, and the rest of the day is mostly
>> read-only traffic, with read-only traffic during all 24 hours (don't
>> scoff -- I had exactly this problem once), then you want to be quite
>> aggressive with the autovacuum settings, because keeping that 100%
>> bloat down is going to pay off in a big way on the read-only traffic.
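For concreteness, the aggressiveness being discussed lives in the
cost-based vacuum delay settings: autovacuum_vacuum_cost_delay,
autovacuum_vacuum_cost_limit and autovacuum_max_workers in
postgresql.conf (a lower delay or a higher limit means vacuum finishes
faster but steals more I/O from user queries). A minimal sketch, assuming
8.4+, where these can also be overridden per table; the table name is
hypothetical and the values are illustrative, not recommendations:

    -- Let autovacuum work this hot table harder than the global default:
    ALTER TABLE my_matview SET (
        autovacuum_vacuum_cost_delay = 10,     -- ms sleep per cost batch (default 20; lower = more I/O)
        autovacuum_vacuum_scale_factor = 0.05  -- trigger at ~5% dead tuples (default 0.2)
    );

On 8.3 and earlier the same per-table overrides go through the
pg_autovacuum catalog instead, and the global GUCs apply to everything.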
> Interesting. Although if you have such well-defined churn times, it
> might be better to vacuum from cron instead of from autovacuum? You also
> don't want to autovacuum now if you know your churn will be over in 15
> min. Looks like it's going to be hard to extract general rules.
>
> One of my motivations to make autovacuum more aggressive was that my fsm
> was too small and I didn't want a PG restart to take the new value into
> account yet. So "finish this vacuum faster and get on to the next one"
> was a way to do that "next one" before the fsm overflowed. But I now
> realize it's a very bad kludge, and I should just have my fsm sized
> right (or sized automatically; have I already said that I long to
> upgrade? :p)
>
>> > * Does vacuuming fill the OS's disk cache, and is it an issue if it
>> >   does?
>>
>> Well, it _affects_ the OS's disk cache. Whether it fills it is
>> controlled by the cache algorithms and the amount of memory you have
>> devoted to cache. Every time you touch the disk, you potentially alter
>> the cache in favour of what you just saw.
>>
>> In the above artificial examples, the vacuums that run "after everyone
>> went home" will almost certainly end up taking over the cache, because
>> there's no other activity to keep other things in the disk cache. In
>> the second example, though, with a lot of read-only activity all the
>> time, the things that are most popular are likely to remain in a
>> (modern) disk cache most of the time because they're called so often
>> that the vacuumed page doesn't end up being enough traffic to cause an
>> eviction (or, anyway, to evict for any significant time).
>
> Ok, so say my churn happens only in the last 10 minutes of data and
> read-only queries only look at the last 24 hours of data: if vacuuming
> is triggered every 48 hours, that's 24 hours of data that will
> potentially get back into the cache with no benefit (or if I'm not
> mistaken, with PG <= 8.3 it's much more than 24 hours). Pity. Is there a
> counter-example where there is a caching benefit to the current
> behaviour? If not, that might be a low-hanging fruit to improve postgres
> performance.

Not a direct answer, but some items after reading the thread:

* 8.4 comes with the visibility map, which is nice for reducing vacuum
  I/O (and the max_fsm_* GUCs are gone, so no more trouble there).
* The PostgreSQL cache has its own, database-oriented logic.
* The operating system cache is smart enough not to waste all of your
  cache when reading one file sequentially.
* You may also increase the number of autovacuum workers (depending on
  your I/O and CPU capacity).
* It is better to change the autovacuum settings via cron than to run
  VACUUM from cron.

If you wonder, you can use pgfincore to track your OS cache usage per
table and index, and the PostgreSQL cache with pg_buffercache. Note that
pgfincore is lock-free, but pg_buffercache may impact your performance
(it is still interesting for checking how your shared buffers are used,
and it can help you size shared_buffers).
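A minimal sketch of what that inspection can look like, assuming the
pg_buffercache contrib module and the pgfincore extension are installed
(the table name is hypothetical):

    -- Top 10 relations by pages currently held in shared_buffers
    -- (pg_buffercache contrib module):
    SELECT c.relname, count(*) AS buffers
    FROM pg_buffercache b
    JOIN pg_class c ON c.relfilenode = b.relfilenode
    WHERE b.reldatabase = (SELECT oid FROM pg_database
                           WHERE datname = current_database())
    GROUP BY c.relname
    ORDER BY buffers DESC
    LIMIT 10;

    -- OS page-cache footprint of one table (pgfincore extension):
    SELECT * FROM pgfincore('my_matview');

The first query is the classic shared_buffers breakdown; the pgfincore
call reports how many OS pages of the relation currently sit in the
kernel's cache.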
--
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: 24x7 Support - Development, Expertise and Training