On Tue, 22 Apr 2008, David Wilson wrote:
> My guess at this point is that I'm just running into index update times and checkpoint IO. The only thing that still seems strange is the highly variable nature of the COPY times- anywhere from <1.0 seconds to >20 seconds, with an average probably around 8ish.
Have you turned on log_checkpoints to see whether the checkpoints are correlated with the slow COPYs? Given that you've had an improvement by increasing checkpoint_segments, it's not out of the question that you're still getting nailed sometimes during the more stressful portions of the checkpoint cycle (usually right near the end). The 1 second ones might just happen to be the ones that start right as the previous checkpoint finishes. To make lining those up easier, you might turn on logging of long statements with log_min_duration_statement so that both bits of data land in the same log file. That might get you some other accidental enlightenment as well (like if there's some other statement going on that's colliding badly with this load).
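A minimal postgresql.conf sketch of that logging setup. The 1000 ms threshold and the log_line_prefix value are assumptions picked for illustration, not numbers taken from this thread; adjust them to whatever separates the fast COPYs from the slow ones.

```conf
# Log the start and completion of every checkpoint
log_checkpoints = on

# Log any statement that runs longer than this many milliseconds
# (1000 is an assumed threshold; -1 disables, 0 logs everything)
log_min_duration_statement = 1000

# Timestamp and backend PID on each line, so checkpoint entries and
# slow-statement entries can be lined up by time (assumed format)
log_line_prefix = '%t [%p] '
```

All three settings can be picked up with a configuration reload rather than a restart. With them in place, the checkpoint entries and the duration entries for slow COPYs end up interleaved in the same log, so you can see whether the 20 second ones cluster near the end of a checkpoint.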
This is a bit out of my area, but after reading the rest of this thread I wonder whether raising the default_statistics_target parameter a bit might cut down on the bad plans showing up.
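As a rough sketch of what that change might look like in postgresql.conf: the value 100 is an assumption for illustration only, and whether it helps depends on the data distribution in the tables involved.

```conf
# Collect more detailed planner statistics per column.
# 100 is an assumed starting point, not a recommendation from this thread;
# larger values make ANALYZE slower and planning slightly more expensive.
default_statistics_target = 100
```

The new target only affects statistics gathered afterwards, so the tables in question need to be re-ANALYZEd before the planner sees any difference.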
--
* Greg Smith gsmith@xxxxxxxxxxxxx http://www.gregsmith.com Baltimore, MD