On 08/21/2014 08:41 AM, Victor d'Agostino wrote:
UPDATE MYBIGTABLE SET date = (SELECT date FROM INDEXEDTABLE WHERE INDEXEDTABLE.email_id=MYBIGTABLE.email_id) WHERE date is null;
I may be wrong here, but wouldn't this style of query force a nested loop? Over several million rows, that would take an extremely long time. You might want to try this syntax instead:
UPDATE MYBIGTABLE big SET date = idx.date FROM INDEXEDTABLE idx WHERE idx.email_id = big.email_id AND big.date IS NULL;
This transaction is still running and will end in several days. It only uses 1 core.
That's not your problem. I suspect if you checked your RAID IO, you'd see 100% IO utilization because instead of a sequence scan, it's performing a random seek for every update.
My question is : Can I add new records in the table or will it generate locks ?
Your update statement will only lock the rows being updated. You should be able to add new rows, but with the IO consuming your RAID, you'll probably see significant write delays that resemble lock waits.
-- Shaun Thomas OptionsHouse, LLC | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604 312-676-8870 sthomas@xxxxxxxxxxxxxxxx ______________________________________________ See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general