Re: Long Running Update

Harry Mantheakis <harry.mantheakis@xxxxxxxxxxxxxxxxxxxxxx> · Fri, 24 Jun 2011 12:16:32 +0100

Thank you Kevin.

> SET field_1 = table_B.field_1, field_2 = table_B.field_2

I will try that, if I have to next time.

> add something like this toyour WHERE clause,
> to prevent unnecessary updates:
>
>   AND (table_B.field_1 IS DISTINCT FROM table_A.field_1
>    OR table_B.field_2 IS DISTINCT FROM table_A.field_2);

Thank you for that explanation - I will keep that in mind in future. (In 
this case, the two fields that are being updated are all known to be 
empty - hence, distinct - in the target table.)

> EXPLAIN the statement

Here is the EXPLAIN result:

----------------------------------------------------------------------
QUERY PLAN
----------------------------------------------------------------------
Hash Join  (cost=2589312.08..16596998.47 rows=74558048 width=63)
Hash Cond: (table_A.id = table_B.id)
->  Seq Scan on table_A(cost=0.00..1941825.05 rows=95612705 width=47)
->  Hash  (cost=1220472.48..1220472.48 rows=74558048 width=20)
->  Seq Scan on table_B(cost=0.00..1220472.48 rows=74558048 width=20)
----------------------------------------------------------------------

The documentation says the 'cost' numbers are 'units of disk page fetches'.

Do you, by any chance, have any notion of how many disk page fetches can 
be processed per second in practice - at least a rough idea?

IOW how do I convert - guesstimate! - these numbers into (plausible) 
time values?

Kind regards

Harry Mantheakis
London, UK

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance