Craig Ringer wrote:
It's also complicated by the fact that Pg's architecture is very poorly suited to prioritizing I/O based on query or process. (AFAIK) basically all writes go through shared_buffers and the bgwriter - neither Pg nor in fact the OS know what query or what backend created a given set of block writes.
You're correct that all writes go through shared_buffers, and all information about the query that dirties the page in the first place is gone by the time it's written out. In 8.3 and later, buffers get written three ways:
(1) A backend needs to allocate a buffer to do some work. The buffer it is allocated is dirty. In this case, the backend itself ends up writing the page out.
(2) The background writer monitors how many allocations are going on, and it tries to keep ahead of the backends by writing pages likely to be re-used in the near future out before (1) happens. (This is the part that was different in earlier versions--the background writer just roamed the whole buffer cache looking for work to do before, unrelated to the amount of activity on the system).
(3) Checkpoints (which are also executed by the background writer) have to write out every dirty buffer in order to reconcile everything between memory and disk.
One reason you can't just ionice the backend and make all the problems go away is (3); you can't let a sluggish backend stop checkpoints from happening.
You might note that only one of these sources--a backend allocating a buffer--is connected to the process you want to limit. If you think of the problem from that side, it actually becomes possible to do something useful here. The most practical way to throttle something down without a complete database redesign is to attack the problem via allocation. If you limited the rate of how many buffers a backend was allowed to allocate and dirty in the first place, that would be extremely effective in limiting its potential damage to I/O too, albeit indirectly. Trying to limit the damage on the write and OS side instead is a dead end, you'll never make that work without a major development job--one that I would bet against ever being committed even if someone did it for a specific platform, because they're all going to be so different and the code so hackish.
-- Greg Smith 2ndQuadrant Baltimore, MD PostgreSQL Training, Services and Support greg@xxxxxxxxxxxxxxx www.2ndQuadrant.com -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance