Thanks for the numbers. Just out of curiosity, do you happen to know how much more expensive compared to that a read from disk is? And also, how much the pinning can be slowed down, when having to iterate using the clock-sweep method over large shared_buffers?
The reason why I mentioned buffers_clean is because I was assuming that under "healthy" conditions, most writes should be done by checkpointer, because, as it was already mentioned, that's the most efficient way of writing (no duplicate writes of the same buffer, write optimizations etc.). I was thinking about bgwriter as a way of reducing latency by avoiding the case when a backend has to write buffers by itself. So that would mean that big numbers in buffers_clean and buffers_backend compared to buffers_checkpoint, would mean that a lot of writes are done not by checkpointer, and thus probably less efficiently than they could be. That might have resulted in IO writes being more random, and more IO writes done in general, because same buffer can be written multiple times between checkpoints.
We do daily manual vacuuming. Knowing what part of total writes is accounted for them indeed would be nice. When looking at buffers_checkpoint/buffers_clean/buffers_backend, I was saving the numbers with several hours interval, knowing that there are no vacuums running at that time, and calculated the difference.
We have mixed type of DB usage. There is OLTP-like part with many small read/write transactions. Predictable latency does not matter in that case, but throughput does, because that is basically a background data loading job. Then there is an OLAP-like part when heavier report queries are being run. Then there are more background jobs which are a combination of both, which at first run long queries and then do lots of small inserts, thus pre-calculating some data for bigger reports. After increasing shared_buffers 8GB -> 64GB, there was 7% improvement in run time of the background pre-calculating job (measured by running several times in a row, and caches are hot). When we configured hugepages for the bigger shared_buffers, the additional improvement was around 3%. Regards, Vitaliy |