Hello,

> At one point I envisioned making it smart enough to try and handle the
> scenario you describe--on an idle system, you may very well want to write
> out dirty and recently accessed buffers if there's nothing else going on.
> But such behavior is counter-productive on a busy system, which is why a
> similar mechanism that existed before 8.3 was removed. Making that only
> happen when idle requires a metric for what "busy" means, which is tricky
> to do given the information available to this particular process.
>
> Short version: if you never fill the buffer cache, buffers_clean will
> always be zero, and you'll only see writes by checkpoints and things not
> operating with the standard client buffer allocation mechanism. Which
> brings us to...

Sure. I am not really out to get the background writer to pre-emptively do
"idle trickling". Though I can see cases where one might care about this
(such as lessening the impact of OS buffer cache delays on checkpoints),
it's not what I am after now.

> > One theory: Is it the auto vacuum process? Stracing those I've seen
> > that they very often do writes directly to disk.
>
> In order to keep it from using up the whole cache with maintenance
> overhead, vacuum allocates a 256K ring of buffers and re-uses ones from
> there whenever possible. That will generate buffers_backend writes when
> that ring fills but it has more left to scan. Your theory that all the
> backend writes are coming from vacuum seems consistent with what you've
> described.

The bit that is inconsistent with this theory, given the above ring buffer
description, is that I saw the backend write-out count increasing
constantly during the write activity I was generating to the database.
However (because in this particular case it was a small database used for
some latency-related testing), no table was ever large enough that
vacuuming a single table would fill the 256K ring. Most tables were likely
between a handful and a couple of hundred pages in size.

In addition, when I say "constantly" above I mean that the count increases
even between successive SELECTs (of the stats view) with only a second or
two in between. In the absence of long-running vacuums, that discounts
vacuuming, because the autovacuum naptime is 1 minute. In fact this
already discounted vacuuming even without the added information you
provided above, but I didn't realize that when originally posting. The
reason I mentioned vacuuming was that the use case is such that we do have
a lot of tables constantly getting writes and updates, but they are all
small.

Is anything else known that might be generating the writes, if it is not
vacuuming?

> You might even want to drop the two background writer parameters you've
> tweaked upwards back down closer to their original values. I get the
> impression you might have increased those hoping for more background
> writer work because you weren't seeing any. If you ever do get to where
> your buffer cache is full and the background writer starts doing
> something, those could jump from ineffective to wastefully heavy at that
> point.

I tweaked them in order to eliminate backends having to do "synchronous"
writes (synchronous with respect to the operating system, even if not with
respect to the underlying device). The idea is that writes to the
operating system are less well understood/controlled in terms of any
latency they may cause.
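For reference, the counters I am watching are the ones in the
pg_stat_bgwriter view (column names as in 8.3); something along these
lines, sampled a second or two apart:

    -- Sample the background writer statistics (PostgreSQL 8.3).
    -- buffers_clean stays at zero until the buffer cache fills up;
    -- buffers_backend counts writes performed directly by backends.
    SELECT buffers_checkpoint,
           buffers_clean,
           buffers_backend,
           buffers_alloc
    FROM pg_stat_bgwriter;

It is the difference in buffers_backend between two such samples that
keeps growing in the scenario described above.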
It would be very nice if the backend writes were always zero under normal
circumstances (or at least grew very rarely, in edge cases where the JIT
policy did not succeed), in order to make systematically increasing
backend write-outs a more relevant and rare observation.

On this topic, by the way: was it considered to allow the administrator to
specify a fixed-size margin to use when applying the JIT policy? (The JIT
policy and logic itself would remain exactly the same.) Especially with
larger buffer caches, that would perhaps allow the administrator to make a
call to truly eliminate synchronous writes during normal operation, while
not adversely affecting anything (if the buffer cache is 1 GB, having a
margin of say 50 MB does not really matter much in terms of wasted memory,
yet could have a significant impact on eliminating synchronous
write-outs).

On a system where you really want to keep backend writes at exactly 0
under normal circumstances (discounting vacuuming), and which has a large
buffer cache (say the one gig), it might be nice to be able to say: "I
have 1 GB of buffer cache; for the purposes of the JIT algorithm, please
pretend it's only 900 MB." The result is a constantly sized margin of
roughly 100 MB with respect to ensuring writes are asynchronous.
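To put a number on that margin (purely illustrative -- such a knob does
not exist today), the roughly 100 MB of cushion in the one-gig example
works out as follows at the default 8 kB block size:

    -- Hypothetical illustration only; a fixed JIT margin is not an
    -- existing setting. A 100 MB margin at 8 kB per buffer comes to:
    SELECT 100 * 1024 / 8 AS margin_buffers;  -- 12800 buffers of cushion

Relative to the 131072 buffers of a 1 GB cache, that is under 10% of the
cache "wasted" in exchange for backends essentially never having to write
synchronously.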
--
/ Peter Schuller

PGP userID: 0xE9758B7D or 'Peter Schuller <peter.schuller@xxxxxxxxxxxx>'
Key retrieval: Send an E-Mail to getpgpkey@xxxxxxxxx
E-Mail: peter.schuller@xxxxxxxxxxxx
Web: http://www.scode.org