Re: Cost of creating an emply WAL segment

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 06/24/2011 10:43 AM, Devrim GÜNDÜZ wrote:
Has anyone measured the cost of creating empty WAL segments while the
database is running?

Actually, when is the new file created? Just after one segment is filled
up, or some time before then? What happens during WAL segment creation?
If there are pending transactions to be committed, do we see a delay?

Excellent set of questions. Yes, and it can be disturbingly high on a fresh system that's immediately hit with lots of transactions, or on one where checkpoint_segments was just raised a lot and then nailed with activity. The problem also pops up in an even worse way when you have a server that's overrun its normal checkpoint finish time, usually due to a slow sync phase. That then also results in creation of new segments--necessary because all of the old ones that would normally be recycled already have been recycled--and that work is now competing against the already slow checkpoint writes, and backends running their own fsync too just to make the mix extra exciting.

I know the server code does try to stay a bit ahead of this problem, by creating one segment in advance under conditions I forget the details of, to reduce the odds a client will actually hit a delay there. It helps the average case. But since it doesn't do much for the worst-case ones people that make my life miserable, working on that doesn't make my life easier; therefore I don't.

The main argument in favor of pre-populating WAL segments early after the database starts is that it would probably be a small cheat on some benchmarks, moving a bit of work that happens during the test to happen before then instead when it isn't counted. But I've never been excited about the idea of creating empty segments near initdb time for 3 reasons:

-Default postgresql.conf at initdb time has the pathetically small 3 checkpoint_segments, so it won't actually work unless we increase it first. -Most servers go through a data loading stage before they hit production that takes care of this anyway. -The worst problems I've ever seen in this area, by far (as in: at least 100X worst than what you're asking about), are when new segments are created due to heavy write activity exhausting the list of ones to be recycled during a checkpoint.

To give you an example of *that*, here is the new record-setting slow checkpoint I just found in my inbox this morning, from a system that's been instrumented for its increasing checkpoint issues the last month:

checkpoint complete: wrote 63297 buffers (6.0%); 0 transaction log file(s) added, 938 removed, 129 recycled; write=250.384 s, sync=14525.296 s, total=14786.868 s

Here checkpoint_segments=64 and shared_buffers=8GB. The fact that this checkpoint hit the "create a new empty WAL segment" code 938 times during its 4 hour checkpoint sync phase is much more troubling than the pauses people run into when making a new segments on a fresh server. I would argue that if your database is new enough to not have populated a full set of checkpoint_segments yet, it's also new enough that it can't possibly have enough data in it yet for that to really be a problem.

(Note for hackers following along this far: I strongly suspect today's latest pathological example is the sort improved by the "compact fsync request" feature added to 9.1. I'm now seeing so many of these in the field that I'm collecting up data to support the idea of backporting that in 8.3-9.0, as a bug fix because this turns into a "system is too slow to be considered operational" performance problem when it flares up. This was a rare thing I'd never seen before back in September when I started working on this area again; now I see it once a month on a new system somewhere.)

--
Greg Smith   2ndQuadrant US    greg@xxxxxxxxxxxxxxx   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books


--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux