On 11/20/2012 10:13 AM, Merlin Moncure wrote:
> have you ruled out numa issues? (http://frosty-postgres.blogspot.com/2012/08/postgresql-numa-and-zone-reclaim-mode.html)
Haha. Yeah. Our zone reclaim mode is off, and node distance is 10 or 20. zone_reclaim_mode is only enabled by default if distance is > 20, unless there's some kernel bug triggering it even when zone reclaim reports as off.
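For anyone wanting to check the same two things on their own box, here is a minimal sketch. It assumes a Linux host that exposes /proc/sys/vm/zone_reclaim_mode and /sys/devices/system/node/node*/distance. The function names are made up for illustration, and the threshold of 20 is just the figure quoted above, not something read from the kernel.

```python
from pathlib import Path

def parse_distances(text):
    """Parse one node 'distance' file: a row of space-separated integers."""
    return [int(tok) for tok in text.split()]

def exceeds_threshold(rows, threshold=20):
    """True if any node-to-node distance is above the threshold.

    The default of 20 is the value quoted in this thread (an assumption),
    not a constant taken from the running kernel.
    """
    return max(max(row) for row in rows) > threshold

def read_numa_state(root="/"):
    """Return (zone_reclaim_mode, distance rows) from a Linux host.

    'root' is a hypothetical parameter so the paths can be pointed at a
    copy of the files; on a live system leave it as "/".
    """
    base = Path(root)
    mode = int((base / "proc/sys/vm/zone_reclaim_mode").read_text().strip())
    rows = [parse_distances(p.read_text())
            for p in sorted(base.glob("sys/devices/system/node/node*/distance"))]
    return mode, rows

# On a NUMA-capable Linux machine:
#   mode, rows = read_numa_state()
#   print(mode, rows, exceeds_threshold(rows))
```

A mode of 0 together with distances of 10 and 20 would match the setup described here.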
I'll also note that my tests with interleave made no difference at all. At least not with pgbench. There was a small amount of fluctuation in TPS, but the page swap storms came regardless of NUMA tweaks. The formula worked like this:
High connection count + high shared_buffers = page swap storm. I'll note that dropping shared_buffers from 8GB to 4GB immediately stopped the paging everywhere, and the OS went from using 13GB for active file cache to 45GB. I can't see how PG would cause something like that by itself.
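The file cache figures above are the kind of thing /proc/meminfo reports. A rough sketch of pulling them out, assuming the usual "Name:   value kB" layout on Linux; the helper names are hypothetical, and the sample text is made up to mirror the 13GB figure rather than copied from a real system.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most values are in kB; a few fields (e.g. HugePages_Total) are plain counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep or not rest.split():
            continue  # skip blank or malformed lines
        info[name.strip()] = int(rest.split()[0])
    return info

def kb_to_gib(kb):
    """Convert kB (as reported by /proc/meminfo) to GiB."""
    return kb / (1024 * 1024)

# Illustrative sample only, not real output from the machines discussed here.
SAMPLE = """\
MemTotal:       74448896 kB
Active(file):   13631488 kB
SwapTotal:       2097152 kB
SwapFree:        1048576 kB
"""

info = parse_meminfo(SAMPLE)
active_file_gib = kb_to_gib(info["Active(file)"])
swap_used_gib = kb_to_gib(info["SwapTotal"] - info["SwapFree"])

# On a live Linux host:
#   with open("/proc/meminfo") as f:
#       info = parse_meminfo(f.read())
```

Sampling this before and after a shared_buffers change, under the same connection count, is one way to compare the two states.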
I only chimed in because it's very adversely affecting our CPU load too, in a similarly inexplicable way that seems to point at the scheduler.
-- 
Shaun Thomas
OptionsHouse | 141 W. Jackson Blvd. | Suite 500 | Chicago IL, 60604
312-444-8534
sthomas@xxxxxxxxxxxxxxxx