Re: Odd sudden performance degradation related to temp object churn

Mark Kirkwood <mark.kirkwood@xxxxxxxxxxxxxxx> · Tue, 22 Aug 2017 13:04:13 +1200

On 19/08/17 13:49, Mark Kirkwood wrote:

On 19/08/17 02:21, Jeremy Finzel wrote:
On Tue, Aug 15, 2017 at 12:07 PM, Scott Marlowe 
<scott.marlowe@xxxxxxxxx> wrote:

    So do iostat or iotop show you if / where your disks are working
    hardest? Or is this CPU overhead that's killing performance?

Sorry for the delayed reply. I took a look in more detail at the 
query plans from our problem query during this incident. There are 
actually 6 plans, because there were 6 unique queries.  I traced one 
query through our logs, and found something really interesting: all 
of the first 5 queries are creating temp tables, and all 
of them took upwards of 500ms each to run.  The final query, however, 
is a simple select from the last temp table, and that query took 
0.035ms!  This really confirms that somehow, the issue had to do with 
/writing/ to the SAN, I think.  Of course this doesn't answer a whole 
lot, because we had no other apparent issues with write performance 
at all.

I also provide some graphs below.

Hi, graphs for latency (or await etc) might be worth looking at too - 
sometimes the troughs between the IO spikes are actually when the 
disks have been overwhelmed with queued up pending IOs...

Sorry - I see you *did* actually have iowait in there under your CPU 
graph, and it doesn't look to be showing a lot of waiting. However, it 
still might be well worth getting graphs showing per-device waits and 
utilization.
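
If no per-device graphs are to hand, the same numbers can be roughly derived from two samples of /proc/diskstats on Linux. What follows is only a minimal sketch under stated assumptions: the function names (parse_diskstats, device_waits, snapshot, main) and the 5 second interval are made up for illustration, and the field positions assume the standard Linux diskstats layout. It approximates the await and %util columns that "iostat -x" reports; it is not a replacement for iostat.

```python
import time


def parse_diskstats(text):
    """Map device name -> (reads, read_ms, writes, write_ms, io_ms).

    Assumes the Linux /proc/diskstats layout: major, minor, name,
    followed by at least 11 counters.
    """
    stats = {}
    for line in text.splitlines():
        f = line.split()
        if len(f) < 14:
            continue  # skip blank or unexpectedly short lines
        # f[3]=reads completed, f[6]=ms spent reading,
        # f[7]=writes completed, f[10]=ms spent writing,
        # f[12]=ms the device spent doing I/O
        stats[f[2]] = (int(f[3]), int(f[6]), int(f[7]), int(f[10]), int(f[12]))
    return stats


def device_waits(before, after, interval_s):
    """Return device -> (await_ms, util_pct) between two snapshots."""
    result = {}
    for dev, b in after.items():
        if dev not in before:
            continue
        a = before[dev]
        ios = (b[0] - a[0]) + (b[2] - a[2])
        wait_ms = (b[1] - a[1]) + (b[3] - a[3])
        await_ms = wait_ms / ios if ios else 0.0
        util_pct = 100.0 * (b[4] - a[4]) / (interval_s * 1000.0)
        result[dev] = (await_ms, util_pct)
    return result


def snapshot(path="/proc/diskstats"):
    """Read and parse the current counters (Linux only)."""
    with open(path) as fh:
        return parse_diskstats(fh.read())


def main(interval_s=5):
    """Take two samples interval_s apart and print per-device figures."""
    first = snapshot()
    time.sleep(interval_s)
    second = snapshot()
    for dev, (aw, ut) in sorted(device_waits(first, second, interval_s).items()):
        print(f"{dev:10s} await={aw:8.2f} ms  util={ut:6.1f}%")
```

Calling main() in a loop during one of the slow periods would show whether any single device (for example the SAN LUN holding temp files) has a high await or is near 100% utilized even when the aggregate iowait looks low.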

regards

Mark

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)