Re: Performance regressions found using sqlfuzz

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





Andres, Andrew, and Tom:

Thanks for your insightful comments! We conducted additional analysis based on your comments and would like to share the results. We would also like to get your feedback on a few design decisions to increase the utility of our performance regression reports.

####### QUERY 2:

As Andres and Andrew correctly pointed out, this regression is not observed after we increase the work_mem parameter.

  * work_mem = 256MB
  * shared_buffers = 1024MB
  * temp_buffers = 64MB

We wanted to check if these settings look good to you. If so, we will use them for validating the regressions identified by sqlfuzz.

####### QUERY 3:

Surprisingly, the performance impact of this regression is more prominent on larger databases. We concur with Andrew in that this might be related to the lack of a Materialize node and mis-costing of the Nested Loop Anti-Join.

When we increased the scale-factor of TPC-C to 300 (~30 GB), this query ran three times slower on v11 (24 seconds) in comparison to v9.5 (7 seconds). We also found more than 15 regressions related to the same commit and share a couple of them below.

Commit: 77cd477 (Enable parallel query by default.)

Summary: Execution Time (milliseconds)

+----------------------+--------+---------+---------+-----------+
|                      | scale1 | scale10 | scale50 | scale 300 |
+----------------------+--------+---------+---------+-----------+
| Case-3 (v9.5)        |     28 |     248 |    1231 |      7265 |
| Case-3 (v11)         |     74 |     677 |    3345 |     24581 |
+----------------------+--------+---------+---------+-----------+
| Case-3A (v9.5)       |     88 |     937 |    4721 |     27241 |
| Case-3A (v11)        |    288 |    2822 |   13838 |     85081 |
+----------------------+--------+---------+---------+-----------+
| Case-3B (v9.5)       |    101 |     934 |    4824 |     29363 |
| Case-3B (v11)        |    200 |    2331 |   12327 |     74110 |
+----------------------+--------+---------+---------+-----------+


###### QUERY 3A:

select  
  ref_0.ol_delivery_d as c1
from
  public.order_line as ref_0
where EXISTS (
  select  
    ref_1.i_im_id as c0      
    from
    public.item as ref_1
    where ref_0.ol_d_id <=  ref_1.i_im_id
  )

 Execution plan:

  [old]
Nested Loop Semi Join  (cost=0.00..90020417940.08 rows=30005835 width=8) (actual time=0.034..24981.895 rows=90017507 loops=1)
  Join Filter: (ref_0.ol_d_id <= ref_1.i_im_id)
  ->  Seq Scan on order_line ref_0  (cost=0.00..2011503.04 rows=90017504 width=12) (actual time=0.022..7145.811 rows=90017507 loops=1)
  ->  Materialize  (cost=0.00..2771.00 rows=100000 width=4) (actual time=0.000..0.000 rows=1 loops=90017507)
        ->  Seq Scan on item ref_1  (cost=0.00..2271.00 rows=100000 width=4) (actual time=0.006..0.006 rows=1 loops=1)
Planning time: 0.290 ms
Execution time: 27241.239 ms

  [new]
Gather  (cost=1000.00..88047487498.82 rows=30005835 width=8) (actual time=0.265..82355.289 rows=90017507 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  ->  Nested Loop Semi Join  (cost=0.00..88044485915.32 rows=12502431 width=8) (actual time=0.033..68529.259 rows=30005836 loops=3)
        Join Filter: (ref_0.ol_d_id <= ref_1.i_im_id)
        ->  Parallel Seq Scan on order_line ref_0  (cost=0.00..1486400.93 rows=37507293 width=12) (actual time=0.023..2789.901 rows=30005836 loops=3)
        ->  Seq Scan on item ref_1  (cost=0.00..2271.00 rows=100000 width=4) (actual time=0.001..0.001 rows=1 loops=90017507)
Planning Time: 0.319 ms
Execution Time: 85081.158 ms

###### QUERY 3B:

select  
  ref_0.ol_i_id as c0
from
  public.order_line as ref_0
where EXISTS (
  select  
    ref_0.ol_delivery_d as c0
    from
    public.order_line as ref_1
    where ref_1.ol_d_id <= cast(nullif(ref_1.ol_o_id, ref_0.ol_i_id) as int4))

 Execution plan:

[old]
Nested Loop Semi Join  (cost=0.00..115638730740936.53 rows=30005835 width=4) (actual time=0.017..27009.302 rows=90017507 loops=1)
  Join Filter: (ref_1.ol_d_id <= NULLIF(ref_1.ol_o_id, ref_0.ol_i_id))
  Rows Removed by Join Filter: 11557
  ->  Seq Scan on order_line ref_0  (cost=0.00..2011503.04 rows=90017504 width=4) (actual time=0.009..7199.540 rows=90017507 loops=1)
  ->  Materialize  (cost=0.00..2813221.56 rows=90017504 width=8) (actual time=0.000..0.000 rows=1 loops=90017507)
        ->  Seq Scan on order_line ref_1  (cost=0.00..2011503.04 rows=90017504 width=8) (actual time=0.001..0.002 rows=14 loops=1)
Planning time: 0.252 ms
Execution time: 29363.737 ms

[new]
Gather  (cost=1000.00..84060490326155.39 rows=30005835 width=4) (actual time=0.272..71712.491 rows=90017507 loops=1)
  Workers Planned: 2
  Workers Launched: 2
  ->  Nested Loop Semi Join  (cost=0.00..84060487324571.89 rows=12502431 width=4) (actual time=0.046..60153.472 rows=30005836 loops=3)
        Join Filter: (ref_1.ol_d_id <= NULLIF(ref_1.ol_o_id, ref_0.ol_i_id))
        Rows Removed by Join Filter: 1717
        ->  Parallel Seq Scan on order_line ref_0  (cost=0.00..1486400.93 rows=37507293 width=4) (actual time=0.023..2819.361 rows=30005836 loops=3)
        ->  Seq Scan on order_line ref_1  (cost=0.00..2011503.04 rows=90017504 width=8) (actual time=0.001..0.001 rows=1 loops=90017507)
Planning Time: 0.334 ms
Execution Time: 74110.942 ms

####### QUERY 4:

As Tom pointed out, this regression is not present in DEV head. We will not report regressions related to NULLIF in the future.

####### QUERY 1:

This regression is also not present in DEV head. We will validate our regressions on DEV head in the future report.

Best, 

Jinho Jung



From: Tom Lane <tgl@xxxxxxxxxxxxx>
Sent: Saturday, February 16, 2019 5:37:49 PM
To: Andres Freund
Cc: Jung, Jinho; Jeff Janes; pgsql-performance@xxxxxxxxxxxxxx
Subject: Re: Performance regressions found using sqlfuzz
 
Andres Freund <andres@xxxxxxxxxxx> writes:
> On 2019-02-14 17:27:40 +0000, Jung, Jinho wrote:
>> - Our analysis: We believe that this regression has to do with two factors: 1) conditional _expression_ (e.g., LEAST or NULLIF) are not reduced to constants unlike string functions (e.g., CHAR_LENGTH)  2) change in the cost estimation function for bitmap scan. Execution time grows by 3 orders of magnitude.  We note that this regression is only observed on large databases (e.g., scale factor of 50).

> Hm. The primary problem here is that the estimation both before and
> after are really bad. So I don't think the commit you point out here is
> really to blame.  I'm not that bothered by the query not being great,
> given the weird construction with LEAST(), but we probably could fix
> that pretty easily.

We already did:

Author: Tom Lane <tgl@xxxxxxxxxxxxx>
Branch: master [6f19a8c41] 2018-12-30 13:42:04 -0500

    Teach eval_const_expressions to constant-fold LEAST/GREATEST expressions.
   
    Doing this requires an assumption that the invoked btree comparison
    function is immutable.  We could check that explicitly, but in other
    places such as contain_mutable_functions we just assume that it's true,
    so we may as well do likewise here.  (If the comparison function's
    behavior isn't immutable, the sort order in indexes built with it would
    be unstable, so it seems certainly wrong for it not to be so.)
   
    Vik Fearing
   
    Discussion: https://postgr.es/m/c6e8504c-4c43-35fa-6c8f-3c0b80a912cc@xxxxxxxxxxxxxxx

BTW, const-folding NULLIF would not be a similarly tiny fix, because
it would need to check for immutability of the underlying operator
(since it is possibly a cross-type comparison, we can't get
away with just assuming it's immutable).  I'm not convinced that
case is worth carrying extra code for.

                        regards, tom lane

[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux