On Fri, Feb 20, 2009 at 10:45 PM, Uwe C. Schroeder <uwe@xxxxxxxxx> wrote:
>
> On Friday 20 February 2009, Tena Sakai wrote:
>> Hi Scott,
>>
>> > What does explain and (it'll take a while to get
>> > it) explain analyze select ... have to say?
>>
>> ----------------------------------------------------------------------------
>>  Hash Join  (cost=165264.65..55486119.31 rows=601095277 width=32)
>>    Hash Cond: (genotype.allele1id = a1.alleleid)
>>    ->  Hash Join  (cost=82632.33..34731274.54 rows=601095277 width=34)
>>          Hash Cond: (genotype.allele2id = a2.alleleid)
>>          ->  Seq Scan on genotype  (cost=0.00..13976429.77 rows=601095277 width=36)
>>          ->  Hash  (cost=42474.59..42474.59 rows=2447659 width=6)
>>                ->  Seq Scan on allele a2  (cost=0.00..42474.59 rows=2447659 width=6)
>>    ->  Hash  (cost=42474.59..42474.59 rows=2447659 width=6)
>>          ->  Seq Scan on allele a1  (cost=0.00..42474.59 rows=2447659 width=6)
>> (9 rows)

I was wrong about this query; it is constrained by the where clause. I much prefer explicit JOIN ... ON syntax, since it's more obvious what is joining to what. Pgsql is smart enough to reorder join clauses as long as it's not constrained by, say, a left join, and even then there's some wiggle room, I think. Anyway...

> The above tells you that you don't have indices in place. Postgres chooses a
> seq scan - which, as the name implies, scans all the rows in sequential order.

Well, he's grabbing everything, so I'm betting an index won't buy you anything. Unless everything fits in memory and you set random_page_cost low enough and shared_buffers and effective_cache_size high enough, an index will lose. However, if you always access the tables in a given order, you can cluster the tables and get really fast results. I'd try creating an index on each table's join column, clustering each table on that index, and adding ORDER BYs to put the result sets into the clustered index order for each joined table.
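A rough sketch of what I mean, using the table and column names from the EXPLAIN output above; the index names here are made up for illustration, and note a table can only be clustered on one index at a time, so you have to pick the access order you actually use:

```sql
-- Hypothetical index names; cluster each table on its join column.
CREATE INDEX allele_alleleid_idx ON allele (alleleid);
CLUSTER allele USING allele_alleleid_idx;

CREATE INDEX genotype_allele1id_idx ON genotype (allele1id);
CLUSTER genotype USING genotype_allele1id_idx;

-- Explicit JOIN ... ON makes the join conditions obvious, and the
-- ORDER BY matches the clustered order so rows come off disk mostly
-- sequentially:
SELECT *
FROM genotype g
JOIN allele a1 ON g.allele1id = a1.alleleid
JOIN allele a2 ON g.allele2id = a2.alleleid
ORDER BY g.allele1id;
```

Remember CLUSTER is a one-time physical reorder; it doesn't maintain itself as rows change, so you'd have to re-run it periodically.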
--
Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin