> What first post? The only thing I can find is a reference in a message
> by you from yesterday, to a two-year old post that you claim is about
> the same problem. Though it's possible that it is the same problem,
> you don't provide any data to back that up.

Strange - you can see the full thread here:
http://groups.google.pl/group/pgsql.general/browse_thread/thread/6c4ea8356327276c

I post via Google, maybe that's the cause?

> The message you referred to was about a one-of-a-kind problem with
> communications to the client and had nothing to do with performance on
> the server; is that indeed what you're seeing? In that case you should
> check your network infrastructure for problems.

No, I actually meant that the CPU usage was low during my query. On the
other hand the IO wait was very high, so the low CPU usage was to be
expected.

> Usually server performance problems are due to problems with tuning
> parameters or outdated statistics. Those issues can usually be solved
> easily.

Well, maybe - I only started using Postgres recently, so I may be making
some obvious mistakes. My database was filled incrementally by a shell
script - I don't know whether that alone lets Postgres collect all the
necessary statistics. Maybe an explicit ANALYZE is necessary? (I put a
sketch of what I plan to try below.) I also tried to VACUUM / VACUUM FULL
my tables before running my queries, but it took too long and I had to
interrupt it.

> Without posting an EXPLAIN ANALYSE people here can only guess what
> your problem is.

I'm posting a plain EXPLAIN below. I did run my query with EXPLAIN
ANALYZE, but forgot to save the output :( I don't think I'll be able to
run the queries again, because they took around 30-60 minutes and almost
killed my server (it was nearly unresponsive during the query, with
about 90% IO wait).

> > BUT I found the real cause of my problem - the "fk2" field from my
> > example had not only an index, but it was also a foreign key to
> > another table.

> That seems unlikely to be the cause. From the above it seems much more
> likely that you're suffering from a bad query plan instead, but you
> don't provide any details.
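Re the statistics: here is what I plan to try to refresh them by hand -
just a minimal sketch, using the table names from my test case below
(pg_stat_user_tables should also show when autovacuum last analyzed the
tables):

ANALYZE article;
ANALYZE keyword;
ANALYZE article_keyword;

-- check when the tables were last analyzed (manually or by autovacuum):
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname IN ('article', 'keyword', 'article_keyword');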
I just tried to create a test case with similar data - here is what it
looks like (it should work if you just paste it into some test db):

-- ------------- generate test tables + data

drop table if exists article, keyword, article_keyword, tochange, sums
cascade;

CREATE TABLE "article" (
    "id" serial NOT NULL PRIMARY KEY,
    "content" varchar(255) NULL,
    "ip" inet NULL,
    "has_comments" bool NOT NULL
);

CREATE TABLE "keyword" (
    "id" serial NOT NULL PRIMARY KEY,
    "keyword" varchar(40) NOT NULL UNIQUE,
    "articles" integer NOT NULL
);

CREATE TABLE "article_keyword" (
    "id" serial NOT NULL PRIMARY KEY,
    "article_id" integer NOT NULL REFERENCES "article" ("id") DEFERRABLE INITIALLY DEFERRED,
    "keyword_id" integer NOT NULL REFERENCES "keyword" ("id") DEFERRABLE INITIALLY DEFERRED,
    "votes_yes" integer NOT NULL,
    "votes_no" integer NOT NULL
);

CREATE INDEX "article_keyword_keyword_id" ON "article_keyword" ("keyword_id");

insert into article(content, ip, has_comments)
values ('some article', '123.121.121.223', true);

insert into keyword
select nextval('keyword_id_seq'), md5(to_char(i, '9999999999999')), 0
from generate_series(1,2000000) as i;

insert into article_keyword
select nextval('article_keyword_id_seq'), 1, k.id, 0, 0
from generate_series(1,2000000) as i
join keyword k on k.keyword=md5(to_char(i, '9999999999999'))
join generate_series(1,5) as times on true;

create table tochange (
    fromid int not null primary key,
    toid int not null
);

insert into tochange
select k1.id, k2.id
from generate_series(1,200000) as i
join keyword k1 on k1.keyword=md5(to_char(i, '9999999999999'))
join keyword k2 on k2.keyword=md5(to_char(i+200000, '9999999999999'));

create table sums (
    id int not null primary key,
    sum int
);

-- ----------------- now my queries:

-- replace fromid's with toid's
update article_keyword
set keyword_id=tc.toid
from tochange tc
where keyword_id=tc.fromid;

-- delete unused keywords
delete from keyword
where id in (
    select k.id
    from keyword k
    left join article_keyword ak on k.id=ak.keyword_id
    where ak.keyword_id is null
);

-- recalculate sums - in how many articles is a keyword used?
insert into sums
select keyword_id, count(*)
from article_keyword
group by keyword_id;

update keyword k
set articles=s.sum
from sums s
where k.id=s.id;

----------------------

The problem is that I can't reproduce the slow behaviour with this test
case. :( The tables are almost identical - only the article table is
bigger in reality (it has around a million rows).

When I run "explain update" (the first update from the test case) it
prints this now:

 Merge Join  (cost=5.14..53436.13 rows=3636710 width=26)
   Merge Cond: (tc.fromid = article_keyword.keyword_id)
   ->  Index Scan using tochange_pkey on tochange tc  (cost=0.00..2830.26 rows=100000 width=8)
   ->  Index Scan using article_keyword_keyword_id on article_keyword  (cost=0.00..148216.29 rows=5000040 width=26)
(4 rows)

When I disable enable_mergejoin (see the snippet below for how I toggle
it) I get this plan:

 Hash Join  (cost=6160.91..274121.21 rows=5500010 width=26)
   Hash Cond: (article_keyword.keyword_id = tc.fromid)
   ->  Seq Scan on article_keyword  (cost=0.00..87353.10 rows=5500010 width=26)
   ->  Hash  (cost=2882.74..2882.74 rows=199774 width=8)
         ->  Seq Scan on tochange tc  (cost=0.00..2882.74 rows=199774 width=8)
(5 rows)

This is the plan I was getting with my original query, which took so
long. The second update was also very slow; the deletes and inserts were
quite fast.
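For reference, this is how I toggle the merge join off when comparing
the plans - only session-local settings, nothing changed in
postgresql.conf:

SET enable_mergejoin = off;

EXPLAIN update article_keyword
set keyword_id=tc.toid
from tochange tc
where keyword_id=tc.fromid;

RESET enable_mergejoin;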
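And about the foreign key I mentioned at the top: if it really is the
culprit, I suppose I could drop the constraint around the bulk update and
re-create it afterwards. A rough sketch - the constraint name here is
just the auto-generated one I'd expect, so I'd verify it with
\d article_keyword first:

-- drop the FK so the bulk update doesn't have to check it per row
ALTER TABLE article_keyword DROP CONSTRAINT article_keyword_keyword_id_fkey;

update article_keyword
set keyword_id=tc.toid
from tochange tc
where keyword_id=tc.fromid;

-- restore the constraint (this revalidates all existing rows)
ALTER TABLE article_keyword ADD CONSTRAINT article_keyword_keyword_id_fkey
    FOREIGN KEY (keyword_id) REFERENCES keyword (id)
    DEFERRABLE INITIALLY DEFERRED;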
> It would also help to know what version of PostgreSQL this is and on
> what hardware and setup you're running into this issue.

I tried both Postgres 8.3 and 8.4; now I use 8.4. I have a standard
config plus pgtune, which added the following entries:

default_statistics_target = 50
maintenance_work_mem = 28MB
constraint_exclusion = on
checkpoint_completion_target = 0.9
effective_cache_size = 352MB
work_mem = 2816kB
wal_buffers = 8MB
checkpoint_segments = 16
shared_buffers = 112MB
max_connections = 80

The server is an Intel(R) Core(TM)2 CPU E8400 @ 3.00GHz with 4GB RAM and
2x SATA disks in RAID1.

Thanks,
MS