Hi,

On 2013-03-25 13:31:17 -0700, Josh Berkus wrote:
> In the past, setting vacuum_freeze_min_age (vfma) really low (say to
> 10000 or 50000) would have caused lots of extra writing work due to
> dirtying extra pages for freezing. This has been our stated reason to
> keep vfma high, despite the obvious advantage of freezing tuples while
> they're still in the cache.
>
> With the visibility map, though, vfma should only be dirtying pages
> which vacuum is already visiting because there's dirty tuples on the
> page. That is, pages which vacuum will probably dirty anyway, freezing
> or not. (This is assuming one has applied the 9.2.3 update.)
>
> Given that, it seems like the cost of lowering vfma *should* be
> marginal. The only extra work done by a lower vfma should be:
>
> 1. extra cpu time to put in the frozenXIDs on vacuumed pages, and
> 2. dirtying the minority of pages which vacuum decided to scan, but not
> write to.

It will also often enough lead to a page being frozen repeatedly, which
causes unnecessary IO and WAL traffic. If a page contains tuples from
several transactions, it's not unlikely that some tuples are older and
some are newer than vfma. Two scenarios make that likely:

- INSERT/UPDATE reusing space on older pages where tuples have been
  deleted.
- When a backend extends a relation, that page is *not* known to have
  free space by other backends. Until vacuum comes along for the first
  time, only this backend will use its space.

Given that busy clusters frequently burn loads of xids per second, it is
not uncommon to have a wide range of xids on such a page.

> And are there other costs I'm not thinking of?

I think (but am not 100% sure right now) it would have another rather
big cost: When a page contains freezable items, as determined by
freeze_min_age, and we are doing a full table scan, we won't skip
buffers that we can't lock for cleanup. Instead we will wait and then
lock them for cleanup.
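[Editor's note: a toy sketch of that skip-versus-wait decision, in plain
Python rather than PostgreSQL's C; the function names, xids, and cutoff
arithmetic are invented for illustration, and xid wraparound is ignored.]

```python
# Toy model of the behaviour described above: during a full-table
# vacuum, a page holding at least one tuple older than the freeze
# cutoff cannot simply be skipped when its cleanup lock is contended;
# vacuum has to wait for the lock instead.

def freeze_cutoff(next_xid, freeze_min_age):
    """Xids older than this must be frozen (wraparound ignored)."""
    return next_xid - freeze_min_age

def must_wait_for_cleanup_lock(tuple_xmins, next_xid, freeze_min_age):
    """True if the page holds a freezable tuple, so a full-scan vacuum
    waits for the cleanup lock instead of skipping the buffer."""
    cutoff = freeze_cutoff(next_xid, freeze_min_age)
    return any(xmin < cutoff for xmin in tuple_xmins)

# A page whose tuples span a wide range of xids:
page = [1_000, 5_000, 990_000]

# With a high vfma nothing on the page is freezable yet, so a
# contended buffer can be skipped; with a low vfma it cannot.
print(must_wait_for_cleanup_lock(page, 1_000_000, 999_500))  # False
print(must_wait_for_cleanup_lock(page, 1_000_000, 50_000))   # True
```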
So I think this would rather noticeably impact the speed of vacuum
(since it waits more often) and concurrency (since we lock more buffers
than before, even if they are actively used).

Makes sense?

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
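[Editor's note: the repeated-freezing cost described earlier can be
sketched as a toy simulation; all xids and settings below are made up
for illustration and this is not PostgreSQL code.]

```python
# Toy simulation: a page whose tuples carry a wide spread of xmins
# (space reuse on an old page, or a freshly extended page used only by
# one backend) gets dirtied -- and WAL-logged -- by freezing on several
# successive vacuums, because each pass a different subset of its
# tuples crosses the low vfma cutoff.

def vacuum_page(xmins, frozen, next_xid, freeze_min_age):
    """Freeze tuples older than the cutoff; return True if the page
    was dirtied, i.e. anything new was frozen this pass."""
    cutoff = next_xid - freeze_min_age
    newly_frozen = {x for x in xmins if x < cutoff} - frozen
    frozen |= newly_frozen
    return bool(newly_frozen)

# Tuples inserted over a wide xid range on one page:
page_xmins = [100_000, 400_000, 700_000]
frozen = set()
dirtied = 0

# Three vacuums, the cluster burning xids in between:
for next_xid in (200_000, 500_000, 800_000):
    if vacuum_page(page_xmins, frozen, next_xid, freeze_min_age=50_000):
        dirtied += 1

print(dirtied)  # -> 3: the same page is dirtied on every pass
```

With a high vfma (say 600000 in this toy model) the same three vacuums
dirty the page for freezing only once, which is the extra IO/WAL cost
the post is pointing at.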