> - - <loh.law@xxxxxxxxxxx> writes: > > The weird thing is that before I updated my server the query was about 5 times faster. > > I've googled and I think the problem lies with the under-estimation of the query planner about the number of rows in the nested table.I will be trying the 'set enable_seqscan = false' solution to see if that'll improve. > > You evidently already do have that turned off. I'd suggest reverting > that change (ie, allow seqscan) and instead increase work_mem enough > so that the hash join can work without spilling to disk. This query > is a perfect example of where indexes do not help, and trying to force > them to be used makes things slower not faster. > > regards, tom lane I have switched on seqscan and increased work_mem to 1GB ... but no luck so far. The version I'm using is PostgreSQL 8.4.8 on i686-pc-linux-gnu, compiled by GCC gcc-4.5.real (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2, 32-bit Before that I used an earlier minor version (8.4.x - I don't remember what x is but it was the one packaged in the version before Ubuntu Natty). These are the relevant schemas. mid VARCHAR(10) NOT NULL ... ); CREATE TABLE t ( mid VARCHAR(10) NOT NULL PRIMARY KEY ... ); I would like to count rows in q whose mid does not exist in t. This is the query I used. SELECT COUNT(*) FROM q WHERE NOT EXISTS (SELECT 1 FROM t WHERE t.mid = q.mid); Based on my understanding, I believe t
he query will loop through each row in q (which has about 500m rows) and for each row it will check a one-to-one mapping against t (which has about 3m rows) by using an index scan on t (mid). However, the EXPLAIN outputs for seqscan = on and seqscan = off, respectively, seem to indicate that it is not a one-to-one mapping of t.mid and q.mid. I then switched the comparison operator in the where clause as follows: SELECT COUNT(*) FROM q WHERE NOT EXISTS
(SELECT 1 FROM t WHERE q.mid = t.mid); As there is no index on q (mid) this type of query should take a considerably longer time. However, the EXPLAIN outputs seem to be the same. Here they are: With seqscan = on &nbs
p; QUERY PLAN ------------------------------------------------------------------------------------- Aggregate (cost=18566199.92..18566199.93 rows=1 width=0) -> Hash Anti Join (cost=747023.15..18566199.91 rows=1 width=0) Hash Cond: ((q.mid)::text = (t.mid)::text) ->
Seq Scan on q (cost=0.00..11451989.24 rows=565972224 width=10) -> Hash (cost=701775.29..701775.29 rows=3619829 width=10) -> Seq Scan on t (cost=0.00..701775.29 rows=3619829 width=10) (6 rows) With seqscan = off
QUERY PLAN ------------------------------------------------------------------------------------------------------- Aggregate (cost=10024599995.73..10024599995.74 rows=1 width=0) -> Hash Anti Join (cost=10006780818.96..10024599995.72 rows=1 width=0) Hash Cond: ((q.mid)::text = (t.mid)::text) -> Seq Scan on q (cost=1000000
0000.00..10011451989.24 rows=565972224 width=10) -> Hash (cost=6735571.10..6735571.10 rows=3619829 width=10) -> Index Scan using t_pkey on t (cost=0.00..6735571.10 rows=3619829 width=10) (6 rows) Any help is greatly appreciated as this problem has been depressing me for two weeks.
|