Re: postgresql.conf recommendations

Johnny Tan <johnnydtan@xxxxxxxxx> · Sat, 9 Feb 2013 08:19:33 -0500

Josh:
Are you able to share your systemtap script? Our problem will be regenerating the same amount of traffic/load that we see in production. We could replay our queries, but we don't even capture a full set because a full set would be roughly 150GB per day.

johnny

On Thu, Feb 7, 2013 at 12:49 PM, Josh Krupka <jkrupka@xxxxxxxxx> wrote:

Just as an update from my angle on the THP side... I put together a systemtap script last night and so far it's confirming my theory (at least in our environment).  I want to go through some more data and make some changes on our test box to see if we can make it go away before declaring success - it's always possible two problems are intertwined or that the THP thing is only showing up because of the *real* problem... you know how it goes.

Basically the systemtap script does this:
- probes the compaction function
- keeps track of the number of calls to it and the aggregate time spent in it, per process
- at the end, spits out the collected info.
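
For anyone who wants to see the bookkeeping spelled out, here is a minimal Python sketch of the per-process aggregation described in the list above. It is not the systemtap script itself (that one probes a kernel function); the event tuple layout and the names aggregate_compactions and report are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def aggregate_compactions(events):
    """Count calls and sum time spent per process.

    events: iterable of (pid, process_name, entry_us, return_us) tuples.
    This layout is an assumption for illustration, standing in for what
    an entry probe and a return probe on the compaction function would record.
    """
    stats = defaultdict(lambda: {"calls": 0, "total_us": 0})
    for pid, name, entry_us, return_us in events:
        entry = stats[(pid, name)]
        entry["calls"] += 1
        entry["total_us"] += return_us - entry_us
    return dict(stats)


def report(stats):
    """Format the collected info, largest aggregate time first."""
    ordered = sorted(stats.items(), key=lambda kv: -kv[1]["total_us"])
    return [
        f"{name}({pid}): {s['calls']} calls, {s['total_us'] / 1e6:.3f} s"
        for (pid, name), s in ordered
    ]
```

Feeding it the recorded entry/return timestamps and printing the lines from report() at the end corresponds to the "spit out the collected info" step.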

So far, when I run the script during a short window in which I know THP compactions are happening, I have been able to match the compaction duration collected via systemtap with a query in the pg logs that took that amount of time or slightly longer (as expected).  A lot of these are only a second or so, so I haven't been able to catch everything, but at least the data I am getting is consistent.
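
The matching step could be sketched roughly as below, in Python. This assumes the compaction totals and the logged query durations have already been parsed into tuples keyed by backend pid; pairing on pid, the tuple layouts, the function name match_compactions, and the slack_s tolerance are all assumptions for illustration, not something taken from the actual script or log format.

```python
def match_compactions(compactions, queries, slack_s=0.5):
    """Pair compaction stalls with queries that took that long or slightly longer.

    compactions: list of (pid, compaction_seconds)
    queries:     list of (pid, query_seconds, sql) parsed from the pg logs
    slack_s:     how much longer than the stall a query may run and still
                 count as a match (hypothetical tolerance)
    """
    matches = []
    for pid, comp_s in compactions:
        for q_pid, q_s, sql in queries:
            # Same backend, and the query lasted at least as long as the
            # compaction but not much longer.
            if q_pid == pid and comp_s <= q_s <= comp_s + slack_s:
                matches.append((pid, comp_s, q_s, sql))
    return matches
```

A query that ran far longer than the stall is left unmatched on purpose, since something other than compaction probably dominated its runtime.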

Will be interested to see what you find, Johnny.