> > On Mon, 11/3/14, Igor Neyman <ineyman@xxxxxxxxxxxxxx> wrote: > > -----Original Message----- > From: pgsql-general-owner@xxxxxxxxxxxxxx > [mailto:pgsql-general-owner@xxxxxxxxxxxxxx] > On Behalf Of pbj@xxxxxxxxxx > Sent: Monday, November 03, 2014 11:34 AM > To: pgsql-general@xxxxxxxxxxxxxx > Subject: [SPAM] - Performance of UPDATE SET = FROM > vs UPDATE SET = (SELECT ...) > > Why does the UPDATE SET = FROM choose a more poorly > performing plan than the UPDATE SET = (SELECT ...)? It > seems to me that it is the same join. > > I'm using 9.3.5. > > CREATE TABLE orig > ( > key1 VARCHAR(11) > PRIMARY KEY, > time1 TIME > ); > > INSERT INTO orig (key1, time1) > SELECT > a::TEXT, > (((random()*100)::INT % > 24)::TEXT || ':' || > ((random()*100)::INT % > 60)::TEXT)::TIME FROM generate_series(80000000000, > 80002000000) a; > > CREATE INDEX odx ON orig(key1); > > CREATE TABLE second (LIKE orig); > > INSERT INTO second (key1) > SELECT > (80000000000+(((random()*1000000)::INT) % 1000000))::TEXT > FROM generate_series(1,400000); > > EXPLAIN ANALYZE > UPDATE second SET time1 = orig.time1 > FROM orig > WHERE second.key1 = orig.key1; > [.....] > > UPDATE second SET time1 = NULL; > > EXPLAIN ANALYZE > UPDATE second SET time1 = (SELECT orig.time1 FROM > orig,second > > WHERE orig.key1 = second.key1 > LIMIT 1); > [.....] > > These 2 queries are not the same. > > The first query updates rows in the "second" table with the > orig.time1 values based on key1 column match. > The second query finds first possible match (based on key1 > column) and assigns orig.time1 value from the matched row to > every record in "second" table. > > Regards, > Igor Neyman I see that now. I was trying to reproduce something from work from memory and got tripped up on a sublety of UPDATE ... SELECT. The query I ran at work was like this: EXPLAIN ANALYZE UPDATE second se SET time1 = (SELECT time1 FROM orig WHERE orig.key1 = se.key1); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------ Update on second se (cost=0.00..3390627.00 rows=400000 width=18) (actual time=18698.795..18698.795 rows=0 loops=1) -> Seq Scan on second se (cost=0.00..3390627.00 rows=400000 width=18) (actual time=7.558..16694.600 rows=400000 loops=1) SubPlan 1 -> Index Scan using odx on orig (cost=0.43..8.45 rows=1 width=8) (actual time=0.033..0.035 rows=1 loops=400000) Index Cond: ((key1)::text = (se.key1)::text) Total runtime: 18698.865 ms (6 rows) This does correctly match and update all of the second table entries. The plan actually runs longer than the UPDATE ... FROM, which squares with a comment the fine manual. Thanks! PJ -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general