Josh:
Are you able to share your systemtap script? Our problem will be regenerating the same amount of traffic/load that we see in production. We could replay our queries, but we don't even capture a full set because it'd be roughly 150GB per day.
johnny
On Thu, Feb 7, 2013 at 12:49 PM, Josh Krupka <jkrupka@xxxxxxxxx> wrote:
Just as an update from my angle on the THP side... I put together a systemtap script last night and so far it's confirming my theory (at least in our environment). I want to go through some more data and make some changes on our test box to see if we can make it go away before declaring success - it's always possible two problems are intertwined or that the THP thing is only showing up because of the *real* problem... you know how it goes.

Basically the systemtap script does this:
- probes the compaction function
- keeps track of the number of calls to it and aggregate time spent in it by process
- at the end spit out the collected info.

So far when I run the script for a short period of time that I know THP compactions are happening, I have been able to match up the compaction duration collected via systemtap with a query in the pg logs that took that amount of time or slightly longer (as expected). A lot of these are only a second or so, so I haven't been able to catch everything, but at least the data I am getting is consistent.

Will be interested to see what you find Johnny.
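
For anyone following along without the script itself, here is a minimal Python sketch of the bookkeeping described in this message. It is not the systemtap script and not anyone's actual tooling. It assumes the probe output has already been reduced to (process name, pid, duration) samples and that the pg log entries carry the backend pid and the statement duration. The function names, tuple layouts and the slack_ms tolerance are all hypothetical.

```python
from collections import defaultdict


def aggregate(events):
    """Count calls and sum time spent per process.

    events: iterable of (process_name, pid, duration_us), one per
    completed compaction call (hypothetical input layout).
    Returns {(process_name, pid): (call_count, total_us)}.
    """
    stats = defaultdict(lambda: [0, 0])
    for name, pid, duration_us in events:
        entry = stats[(name, pid)]
        entry[0] += 1            # number of calls
        entry[1] += duration_us  # aggregate time in the function
    return {key: tuple(value) for key, value in stats.items()}


def match_queries(compactions, queries, slack_ms=250.0):
    """Pair each compaction with queries from the same backend that took
    that long or slightly longer.

    compactions: list of (pid, duration_ms)
    queries:     list of (pid, duration_ms, sql) taken from the pg logs
    slack_ms:    assumed tolerance for "slightly longer" (hypothetical)
    """
    matches = []
    for c_pid, c_ms in compactions:
        for q_pid, q_ms, sql in queries:
            if q_pid == c_pid and c_ms <= q_ms <= c_ms + slack_ms:
                matches.append((c_pid, c_ms, q_ms, sql))
    return matches
```

A query that ran far longer than the compaction, or one from a different backend, is left unmatched on purpose, so the result only lists the cases where the stall plausibly accounts for most of the query time.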