On Thu, 2011-07-07 at 15:34 +0200, vincent dephily wrote: > Hi, > > I have a delete query taking 7.2G of ram (and counting) but I do not > understant why so much memory is necessary. The server has 12G, and > I'm afraid it'll go into swap. Using postgres 8.3.14. > > I'm purging some old data from table t1, which should cascade-delete > referencing rows in t2. Here's an anonymized rundown : > > > # \d t1 > Table "public.t1" > Column | Type | Modifiers > -----------+-----------------------------+--------------------------------- > t1id | integer | not null default > nextval('t1_t1id_seq'::regclass) > (...snip...) > Indexes: > "message_pkey" PRIMARY KEY, btree (id) > (...snip...) > > # \d t2 > Table "public.t2" > Column | Type | Modifiers > -----------------+-----------------------------+----------------------------- > t2id | integer | not null default > nextval('t2_t2id_seq'::regclass) > t1id | integer | not null > foo | integer | not null > bar | timestamp without time zone | not null default now() > Indexes: > "t2_pkey" PRIMARY KEY, btree (t2id) > "t2_bar_key" btree (bar) > "t2_t1id_key" btree (t1id) > Foreign-key constraints: > "t2_t1id_fkey" FOREIGN KEY (t1id) REFERENCES t1(t1id) ON UPDATE > RESTRICT ON DELETE CASCADE > > # explain delete from t1 where t1id in (select t1id from t2 where > foo=0 and bar < '20101101'); > QUERY PLAN > ----------------------------------------------------------------------------- > Nested Loop (cost=5088742.39..6705282.32 rows=30849 width=6) > -> HashAggregate (cost=5088742.39..5089050.88 rows=30849 width=4) > -> Index Scan using t2_bar_key on t2 (cost=0.00..5035501.50 > rows=21296354 width=4) > Index Cond: (bar < '2010-11-01 00:00:00'::timestamp > without time zone) > Filter: (foo = 0) > -> Index Scan using t1_pkey on t1 (cost=0.00..52.38 rows=1 width=10) > Index Cond: (t1.t1id = t2.t1id) > (7 rows) > > > Note that the estimate of 30849 rows is way off : there should be > around 55M rows deleted from t1, and 2-3 times as much from t2. > > When looking at the plan, I can easily imagine that data gets > accumulated below the nestedloop (thus using all that memory), but why > isn't each entry freed once one row has been deleted from t1 ? That > entry isn't going to be found again in t1 or in t2, so why keep it > around ? > > Is there a better way to write this query ? Would postgres 8.4/9.0 > handle things better ? > Do you have any DELETE triggers in t1 and/or t2? -- Guillaume http://blog.guillaume.lelarge.info http://www.dalibo.com -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance