Search Postgresql Archives

Plan for update ... where a is not distinct from b

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



[PostgreSQL 9.5.10 on x86_64-pc-linux-gnu, compiled by gcc (Debian
4.9.2-10) 4.9.2, 64-bit]

I noticed that an update was taking a long time and found this:

UPDATE public.facttable_imf_ifs p
SET [...lots of columns...]
FROM cleansing.cls_imf_ifs_facttable_imf_ifs c, cleansing.cleansing_change_type ct
WHERE
    (p.macrobondtimeseries is not distinct from c.macrobondtimeseries) AND (p.date is not distinct from c.date)
    AND c.cleansing_change_type_id = ct.cleansing_change_type_id
    AND ct.cleansing_change_type_desc_short IN ('UPDATED_NEW')
;

╔════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╗
║                                                                       QUERY PLAN                                                                       ║
╟────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╢
║ Update on facttable_imf_ifs p  (cost=1.09..1978353972070.75 rows=7969398 width=1498)                                                                   ║
║   ->  Nested Loop  (cost=1.09..1978353972070.75 rows=7969398 width=1498)                                                                               ║
║         Join Filter: ((NOT ((p.macrobondtimeseries)::text IS DISTINCT FROM (c.macrobondtimeseries)::text)) AND (NOT (p.date IS DISTINCT FROM c.date))) ║
║         ->  Seq Scan on facttable_imf_ifs p  (cost=0.00..1071317.45 rows=20727045 width=51)                                                            ║
║         ->  Materialize  (cost=1.09..2028790.72 rows=5454160 width=1472)                                                                               ║
║               ->  Hash Join  (cost=1.09..2001519.92 rows=5454160 width=1472)                                                                           ║
║                     Hash Cond: (c.cleansing_change_type_id = ct.cleansing_change_type_id)                                                              ║
║                     ->  Seq Scan on cls_imf_ifs_facttable_imf_ifs c  (cost=0.00..1824258.62 rows=32724962 width=1470)                                  ║
║                     ->  Hash  (cost=1.07..1.07 rows=1 width=8)                                                                                         ║
║                           ->  Seq Scan on cleansing_change_type ct  (cost=0.00..1.07 rows=1 width=8)                                                   ║
║                                 Filter: ((cleansing_change_type_desc_short)::text = 'UPDATED_NEW'::text)                                               ║
╚════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╝

Am I correct to assume that the result of Materialize is a flat table
(in memory) without any indexes and that the nested loop has to scan
that for each of the 20 million rows of the target table?

That's going to take a long time ...

Replacing the "is not distinct from" with "=" (which is possible in this
case because both columns are not null (and indeed the primary key), gives me this plan:

╔════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╗
║                                                         QUERY PLAN                                                         ║
╟────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╢
║ Update on facttable_imf_ifs p  (cost=2611816.89..4435860.27 rows=7969425 width=1498)                                       ║
║   ->  Merge Join  (cost=2611816.89..4435860.27 rows=7969425 width=1498)                                                    ║
║         Merge Cond: (((p.macrobondtimeseries)::text = (c.macrobondtimeseries)::text) AND (p.date = c.date))                ║
║         ->  Index Scan using facttable_imf_ifs_pkey on facttable_imf_ifs p  (cost=0.56..1541107.94 rows=20727117 width=51) ║
║         ->  Sort  (cost=2611811.12..2625446.52 rows=5454160 width=1472)                                                    ║
║               Sort Key: c.macrobondtimeseries, c.date                                                                      ║
║               ->  Hash Join  (cost=1.09..2001519.92 rows=5454160 width=1472)                                               ║
║                     Hash Cond: (c.cleansing_change_type_id = ct.cleansing_change_type_id)                                  ║
║                     ->  Seq Scan on cls_imf_ifs_facttable_imf_ifs c  (cost=0.00..1824258.62 rows=32724962 width=1470)      ║
║                     ->  Hash  (cost=1.07..1.07 rows=1 width=8)                                                             ║
║                           ->  Seq Scan on cleansing_change_type ct  (cost=0.00..1.07 rows=1 width=8)                       ║
║                                 Filter: ((cleansing_change_type_desc_short)::text = 'UPDATED_NEW'::text)                   ║
╚════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╝

which obviously looks a lot nicer, but even replacing
(A is not distinct from B)
with the equivalent
(A = B or A is null and B is null)
gives a different plan:

╔═══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╗
║                                                                                                          QUERY PLAN
╟───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╢
║ Update on facttable_imf_ifs p  (cost=5192.25..28394567337.17 rows=7969813 width=1498)
║   ->  Nested Loop  (cost=5192.25..28394567337.17 rows=7969813 width=1498)
║         ->  Hash Join  (cost=1.09..2001519.92 rows=5454160 width=1472)
║               Hash Cond: (c.cleansing_change_type_id = ct.cleansing_change_type_id)
║               ->  Seq Scan on cls_imf_ifs_facttable_imf_ifs c  (cost=0.00..1824258.62 rows=32724962 width=1470)
║               ->  Hash  (cost=1.07..1.07 rows=1 width=8)
║                     ->  Seq Scan on cleansing_change_type ct  (cost=0.00..1.07 rows=1 width=8)
║                           Filter: ((cleansing_change_type_desc_short)::text = 'UPDATED_NEW'::text)
║         ->  Bitmap Heap Scan on facttable_imf_ifs p  (cost=5191.16..5205.65 rows=1 width=51)
║               Recheck Cond: (((macrobondtimeseries)::text = (c.macrobondtimeseries)::text) OR (macrobondtimeseries IS NULL))
║               Filter: ((((macrobondtimeseries)::text = (c.macrobondtimeseries)::text) OR ((macrobondtimeseries IS NULL) AND (c.macrobondtimeseries IS NULL))) AND ((date = c.date) OR ((date IS NULL) AND (c.date IS NULL))))
║               ->  BitmapOr  (cost=5191.16..5191.16 rows=866 width=0)
║                     ->  Bitmap Index Scan on facttable_imf_ifs_pkey  (cost=0.00..7.06 rows=866 width=0)
║                           Index Cond: ((macrobondtimeseries)::text = (c.macrobondtimeseries)::text)
║                     ->  Bitmap Index Scan on facttable_imf_ifs_pkey  (cost=0.00..2.07 rows=1 width=0)
║                           Index Cond: (macrobondtimeseries IS NULL)
╚═══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╝

        hp

-- 
   _  | Peter J. Holzer    | we build much bigger, better disasters now
|_|_) |                    | because we have much more sophisticated
| |   | hjp@xxxxxx         | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>

Attachment: signature.asc
Description: PGP signature


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux