On Thu, 6 Nov 2008, Peter Schuller wrote:
>> In order to keep it from using up the whole cache with maintenance
>> overhead, vacuum allocates a 256K ring of buffers and re-uses ones
>> from there whenever possible.
>
> no table was ever large enough that 256K buffers would ever be filled by
> the process of vacuuming a single table.
Not 256K buffers--256KB total, which works out to 32 buffers at the
default 8KB block size.
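For anyone who wants the arithmetic spelled out, here's a minimal
sketch; the macro names are invented for illustration rather than
lifted from the PostgreSQL source:

#include <stdio.h>

/* Invented names; the constants mirror the defaults discussed above. */
#define BLCKSZ     8192                     /* default 8KB block size */
#define RING_BYTES (256 * 1024)             /* vacuum's ring allocation */
#define RING_SLOTS (RING_BYTES / BLCKSZ)    /* = 32 buffer slots */

int main(void)
{
    /* The whole correction in one line: 256KB of ring divided by the
     * 8KB block size is 32 buffers, not 256K of them. */
    printf("ring = %d bytes / %d bytes per block = %d buffers\n",
           RING_BYTES, BLCKSZ, RING_SLOTS);
    return 0;
}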
> In addition, when I say "constantly" above I mean that the count
> increases even between successive SELECTs (of the stat table) with
> only a second or two in between.
Writes to the database while you're only doing read operations are
usually related to hint bits: http://wiki.postgresql.org/wiki/Hint_Bits
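The mechanism that page describes goes roughly like the sketch below.
Every name and structure here is invented for illustration--the real
tuple headers and clog machinery are more involved--but the control
flow is the point: the first reader pays for a transaction status
lookup and caches the answer on the tuple, which dirties the page.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define HINT_XMIN_COMMITTED 0x01    /* invented flag name */

typedef struct
{
    uint32_t xmin;          /* transaction that inserted the tuple */
    uint8_t  hint_bits;     /* cached visibility knowledge */
} Tuple;

/* Stub standing in for the expensive transaction status (clog) lookup;
 * here every transaction is simply treated as committed. */
static bool
xact_committed(uint32_t xid)
{
    (void) xid;
    return true;
}

static bool
tuple_visible(Tuple *tup, bool *page_dirtied)
{
    /* Fast path: an earlier reader already recorded the answer. */
    if (tup->hint_bits & HINT_XMIN_COMMITTED)
        return true;

    /* Slow path: consult the clog, then cache the result on the tuple
     * itself.  That cached write is why a read-only SELECT can still
     * generate write traffic. */
    if (xact_committed(tup->xmin))
    {
        tup->hint_bits |= HINT_XMIN_COMMITTED;
        *page_dirtied = true;
        return true;
    }
    return false;
}

int main(void)
{
    Tuple t = { .xmin = 1000, .hint_bits = 0 };
    bool  dirtied = false;

    tuple_visible(&t, &dirtied);    /* first read sets the hint bit */
    printf("first read dirtied page: %s\n", dirtied ? "yes" : "no");

    dirtied = false;
    tuple_visible(&t, &dirtied);    /* second read takes the fast path */
    printf("second read dirtied page: %s\n", dirtied ? "yes" : "no");
    return 0;
}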
> On this topic btw, was it considered to allow the administrator to
> specify a fixed-size margin to use when applying the JIT policy?
Right now, there's no way to know exactly what's in the buffer cache
without scanning the individual buffers, which requires locking their
headers so you can see them consistently. No one process can get the big
picture without doing something intrusive like that, and on a busy system
the overhead of collecting more data to know exactly how far ahead the
cleaning is can drag down overall performance. A lot can happen while the
background writer is sleeping.
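Here's a rough illustration of why that scan is intrusive. The
structures are invented, with a POSIX spinlock standing in for the
per-buffer header locks:

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

#define NBUFFERS 4096           /* stand-in for shared_buffers */

typedef struct
{
    pthread_spinlock_t lock;    /* each header has its own lock */
    bool dirty;
} BufferHeader;

static BufferHeader buffers[NBUFFERS];

/* Answering "how many dirty buffers are there?" means taking and
 * releasing NBUFFERS locks, contending with every backend using those
 * buffers--and the total is already stale when the loop finishes. */
static int
count_dirty_buffers(void)
{
    int dirty = 0;

    for (int i = 0; i < NBUFFERS; i++)
    {
        pthread_spin_lock(&buffers[i].lock);
        if (buffers[i].dirty)
            dirty++;
        pthread_spin_unlock(&buffers[i].lock);
    }
    return dirty;
}

int main(void)
{
    for (int i = 0; i < NBUFFERS; i++)
        pthread_spin_init(&buffers[i].lock, PTHREAD_PROCESS_PRIVATE);

    printf("dirty buffers (stale by now): %d\n", count_dirty_buffers());
    return 0;
}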
One next-generation design which has been sketched out but not even
prototyped would take cleaned buffers and add them to the internal list of
buffers that are free, which right now is usually empty on the theory that
cached data is always more useful than a reserved buffer. If you
developed a reasonable model for how many buffers you needed and padded
that appropriately, that's the easiest way (given the rest of the buffer
manager code) to get close to ensuring there aren't any backend writes.
Because you've got the OS buffering writes anyway in most cases, it's
hard to pin down whether that would actually improve worst-case latency
though. And
moving in that direction always seems to reduce average throughput even in
write-heavy benchmarks.
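To make the shape of that design concrete, here's a bare-bones sketch.
The demand model and padding factor are made up, and none of this
reflects actual PostgreSQL code:

#include <stdio.h>

#define NBUFFERS 4096

/* Invented free list: indexes of buffers the background writer has
 * already cleaned, kept topped up to (modeled demand + padding). */
static int freelist[NBUFFERS];
static int freelist_len = 0;

/* Made-up demand model: recent allocation rate times the bgwriter
 * sleep interval, padded by 50% to absorb bursts between runs. */
static int
target_free_buffers(int allocs_per_sec, double sleep_seconds)
{
    return (int) (allocs_per_sec * sleep_seconds * 1.5);
}

/* Background writer side: after writing a dirty buffer out, park it
 * on the free list rather than just leaving it clean in place. */
static void
bgwriter_add_cleaned(int buf_id)
{
    if (freelist_len < NBUFFERS)
        freelist[freelist_len++] = buf_id;
}

/* Backend side: take a pre-cleaned buffer, so the backend never has
 * to write out a dirty page itself just to make room. */
static int
backend_get_buffer(void)
{
    if (freelist_len > 0)
        return freelist[--freelist_len];
    return -1;      /* empty: fall back to evicting, maybe writing */
}

int main(void)
{
    int target = target_free_buffers(200, 0.2);    /* hypothetical rates */

    for (int i = 0; i < target; i++)
        bgwriter_add_cleaned(i);
    printf("target=%d free buffers, handed out buffer %d\n",
           target, backend_get_buffer());
    return 0;
}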
The important thing to remember is that the underlying OS has its own read
and write caching mechanisms here, and unless the PostgreSQL ones are
measurably better than those you might as well let the OS manage the
problem instead. It's easy to demonstrate that's happening when you give
a decent amount of memory to shared_buffers, it's much harder to prove
that's the case for an improved write scheduling algorithm. Stepping back
a bit, you might even consider that one reason PostgreSQL has grown as
well as it has in scalability is exactly because it's been riding
improvements in the underlying OS in many of these cases, rather than trying
to do all the I/O scheduling itself.
--
* Greg Smith gsmith@xxxxxxxxxxxxx http://www.gregsmith.com Baltimore, MD