On Sun, 4 Oct 2009, Gerhard Wiesinger wrote:
> On Fri, 2 Oct 2009, Scott Marlowe wrote:
>> I found that lowering checkpoint completion target was what helped.
>> Does that seem counter-intuitive to you?
> I set it to 0.0 now.
If you set that to 0.0, the whole checkpoint spreading logic doesn't apply
like it's supposed to. I'm not sure what the results you posted mean now.
If you had it set to 0 and saw a bad spike (which is how I read your
message), I'd say "yes, that's what happens when you reduce that
parameter, so don't do that". If you meant something else, please clarify.
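If you want to see what the checkpoints and background writer are actually
doing while you play with that setting, a quick look at pg_stat_bgwriter
(available since 8.3) goes a long way; something along these lines is all I
have in mind, nothing specific to your setup:

-- Counters are cumulative since statistics collection started
SELECT checkpoints_timed,    -- checkpoints triggered by checkpoint_timeout
       checkpoints_req,      -- checkpoints forced by filling checkpoint_segments
       buffers_checkpoint,   -- buffers written out by checkpoints
       buffers_clean,        -- buffers written out by the background writer
       buffers_backend       -- buffers the backends had to write themselves
FROM pg_stat_bgwriter;

Sampling those counters before and after a test run and diffing them shows
how the writes got split between checkpoints, the background writer, and the
backends.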
Thanks for the dtrace example; I suggested we add those checkpoint probes
in there and someone did, but I hadn't seen anybody use them for anything
yet.
> Bug1: usage_count is IMHO not consistent
It's a bit hack-ish, but the changes made to support multiple buffer use
strategies introduced by the "Make large sequential scans and VACUUMs work
in a limited-size ring" commit are reasonable even if they're not as
consistent as we'd like. Those changes were supported by benchmarks
proving their utility, which always trump theoretical "that shouldn't work
better!" claims when profiling performance.
Also, they make sense to me, but I've spent a lot of time staring at
pg_buffercache output to get a feel for what shows up in there under
various circumstances. That's where I'd suggest you go if this doesn't
seem right to you; run some real database tests and use pg_buffercache to
see what's inside the cache when you're done. What's in there and what I
expected to be in there weren't always the same thing, and it's
interesting to note how that changes as shared_buffers increases. I
consider some time studying that a prerequisite to analyzing the performance
of this code.
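To give a flavor of what I stare at, queries along these lines are where I'd
start; this assumes the pg_buffercache contrib module is installed in the
database you're testing (and note the view spells the column usagecount):

-- Distribution of usage counts across the whole buffer cache;
-- rows with a NULL usagecount are buffers that aren't currently in use
SELECT usagecount, count(*) AS buffers,
       sum(CASE WHEN isdirty THEN 1 ELSE 0 END) AS dirty
FROM pg_buffercache
GROUP BY usagecount
ORDER BY usagecount;

-- Which relations in the current database hold the most buffers
SELECT c.relname, count(*) AS buffers
FROM pg_buffercache b
JOIN pg_class c ON b.relfilenode = c.relfilenode
WHERE b.reldatabase IN (0, (SELECT oid FROM pg_database
                            WHERE datname = current_database()))
GROUP BY c.relname
ORDER BY buffers DESC
LIMIT 10;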
> Bug2: Double iteration of buffers
> As you can see in the calling tree below there is double iteration with
> buffers involved. This might be a major performance bottleneck.
Hmmm, this might be a real bug causing scans through the buffer cache to
go twice as fast as intended. But since the part you suggest is doubled
isn't very intensive or called all that often, there's no way it can be a
major issue. That's based on knowing what the code does and how often it
gets called, as well as some confidence that if it were really a *major*
problem, it would have shown up in the extensive benchmarks done on all
the code paths you're investigating.
> BTW: Are there any tests available showing how fast a buffer cache hit is
> versus a disk cache hit (not in the buffer cache, but in the OS disk cache)?
> I ask because a lot of locking is involved in the code.
I did some once but didn't find anything particularly interesting about
the results. Since you seem to be on a research tear here, it would be
helpful to have a script available to test that out. I wasn't able to
release mine, and something dtrace-based would probably be better than the
approach I used (I threw a bunch of gettimeofday calls into the logs and
post-processed them with a script).
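For what it's worth, the crude non-dtrace version of that comparison can be
sketched out right in psql with \timing; the test table here is made up just
for illustration, and the restart step assumes the OS cache survives while
shared_buffers gets emptied:

-- Hypothetical test table, just big enough to be interesting
CREATE TABLE cache_test AS SELECT generate_series(1, 1000000) AS i;

\timing

-- First pass: blocks come from disk (or the OS cache if they're already there)
SELECT count(*) FROM cache_test;

-- Second pass: should be mostly shared_buffers hits, assuming the table fits
SELECT count(*) FROM cache_test;

-- Now restart the server (empties shared_buffers but normally not the OS
-- cache) and run the same query again; the gap between that time and the
-- second pass above roughly separates an OS cache hit from a buffer cache hit.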
> BTW2: Oracle buffercache and background writer strategy is also interesting.
As a rule, we don't post links to other databases' implementation details
here, as those can have patented design details we'd prefer not to
intentionally re-implement. Much of Oracle's design doesn't apply
here anyway, as it was done in the era when all of their writes were
synchronous. That required them to worry about doing a good job on some
things in their background writer that we shrug off and let OS write
caching combined with fsync handle instead.
--
* Greg Smith gsmith@xxxxxxxxxxxxx http://www.gregsmith.com Baltimore, MD