Search Postgresql Archives

Re: Performance degradation when using auto_explain

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 01/04/2017 08:54 PM, Kisung Kim wrote:


On Wed, Jan 4, 2017 at 1:21 AM, Andreas Kretschmer
<akretschmer@xxxxxxxxxxxxx <mailto:akretschmer@xxxxxxxxxxxxx>> wrote:

    Kisung Kim <kskim@xxxxxxxxxxx <mailto:kskim@xxxxxxxxxxx>> wrote:

    > And finally I found that auto_explain is the cause of the problem.

    real hardware or virtual hardware? On virtual there are sometimes
    problems with exact timings, please read:

    https://www.postgresql.org/docs/current/static/pgtesttiming.html
    <https://www.postgresql.org/docs/current/static/pgtesttiming.html>


Thank you for your reply.
I use real hardware.
I am curious timing functions have contentions when multi-threads call them.


It's not so much about contention between threads/processes, but mostly about the cost of actually reading data from the clock source. So even if you run on physical hardware, the clock source may be slow. Actually, there may be multiple clock sources available, differing in precision and overhead.

See:

/sys/devices/system/clocksource/*/available_clocksource

On my machine I see 3 different sources "tsc hpet acpi_pm" and after running the pg_test_timing tool, linked by Andreas, and I get this for 'tsc' clock source

Testing timing overhead for 3 seconds.
Per loop time including overhead: 29.87 nsec
Histogram of timing durations:
< usec   % of total      count
     1     97.09159   97499400
     2      2.90085    2913031
     4      0.00193       1936
     8      0.00089        891
    16      0.00140       1405
    32      0.00335       3366
    64      0.00000          2

suggesting that 97% of calls took less than 1 usec, which is quite good. For comparison, using 'hpet' gives me this:

Testing timing overhead for 3 seconds.
Per loop time including overhead: 766.92 ns
Histogram of timing durations:
  < us   % of total      count
     1     27.69558    1083389
     2     71.28437    2788485
     4      0.75748      29631
     8      0.02886       1129
    16      0.06578       2573
    32      0.16755       6554
    64      0.00033         13
   128      0.00003          1
   256      0.00003          1

Which is clearly much worse (it increased the per-loop cost from 30ns to 767ns, which is ~25x more).

So check which clock source you have selected, and test how expensive that is. But even with a fast clock source, the additional timing overhead may make EXPLAIN ANALYZE considerably slower. There's not much we can do about it :-(

And to make matters worse, it may affect different plans differently (e.g. nested loop joins do many more gettimeofday calls than other join types, amplifying the timing costs).

But the question is whether you actually need the timing - the total duration + row counts are far more important in my experience, so by setting

auto_explain.log_analyze = on
auto_explain.log_timing = off

you may significantly reduce the impact on query performance, while retaining the most valuable information.

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux