Re: Long Running Update - My Solution

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello Kevin

> If you use EXPLAIN with both statements...

Yes, the plans are indeed very different.

Here is the statement, set to update up to 100,000 records, which took about 5 seconds to complete:


UPDATE
  table_A
SET
  field_1 = table_B.field_1
, field_2 = table_B.field_2
FROM
  table_B
WHERE
  table_B.tb_id >= 0
AND
  table_B.tb_id <= 100000
AND
  table_B.tb_id = table_A.ta_id
;


The query plan for the above is:


Nested Loop  (cost=0.00..2127044.47 rows=73620 width=63)
-> Index Scan using table_B_pkey on table_B (cost=0.00..151830.75 rows=73620 width=20)
        Index Cond: ((tb_id >= 0) AND (tb_id <= 100000))
-> Index Scan using table_A_pkey on table_A (cost=0.00..26.82 rows=1 width=47)
        Index Cond: (table_A.ta_id = table_B.tb_id)


Now, if I change the first AND clause to update 1M records, as follows:


table_B.id <= 1000000


I get the following - quite different - query plan:


Hash Join  (cost=537057.49..8041177.88 rows=852150 width=63)
  Hash Cond: (table_A.ta_id = table_B.tb_id)
  ->  Seq Scan on table_A  (cost=0.00..3294347.71 rows=145561171 width=47)
  ->  Hash  (cost=521411.62..521411.62 rows=852150 width=20)
-> Bitmap Heap Scan on table_B (cost=22454.78..521411.62 rows=852150 width=20)
              Recheck Cond: ((tb_id >= 0) AND (tb_id <= 1000000))
-> Bitmap Index Scan on table_B_pkey (cost=0.00..22241.74 rows=852150 width=0)
                    Index Cond: ((tb_id >= 0) AND (tb_id <= 1000000))


Note: When I tried updating 1M records, the command was still running after 25 minutes before I killed it.

The sequential scan in the later plan looks expensive, and (I think) supports what others have since mentioned, namely that when the optimizer moves to using sequential scans (working off the disk) things get a lot slower.

For me, the penny has finally dropped on why I should use EXPLAIN for bulk operations.

Thanks too, to Greg Smith, Robert Klemme and Thomas for all the feedback.

Kind regards

Harry Mantheakis
London, UK




--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux