Re: very, very slow performance

"Uwe C. Schroeder" <uwe@xxxxxxxxx> · Fri, 20 Feb 2009 21:45:49 -0800

On Friday 20 February 2009, Tena Sakai wrote:
> Hi Scott,
>
> > What does explain and (it'll take a while to get
> > it) explain analyze select ... have to say?
>
> ---------------------------------------------------------------------------
> Hash Join  (cost=165264.65..55486119.31 rows=601095277 width=32)
>   Hash Cond: (genotype.allele1id = a1.alleleid)
>   ->  Hash Join  (cost=82632.33..34731274.54 rows=601095277 width=34)
>         Hash Cond: (genotype.allele2id = a2.alleleid)
>         ->  Seq Scan on genotype  (cost=0.00..13976429.77 rows=601095277 width=36)
>         ->  Hash  (cost=42474.59..42474.59 rows=2447659 width=6)
>               ->  Seq Scan on allele a2  (cost=0.00..42474.59 rows=2447659 width=6)
>   ->  Hash  (cost=42474.59..42474.59 rows=2447659 width=6)
>         ->  Seq Scan on allele a1  (cost=0.00..42474.59 rows=2447659 width=6)
> (9 rows)

The above tells you that you don't have indices in place. Postgres chooses a
seq scan, which as the name implies scans all the rows in sequential order.

I'd add indexes on genotype.allele1id and genotype.allele2id, i.e.:
create index gtallele1idx on genotype (allele1id);
create index gtallele2idx on genotype (allele2id);

and also on allele.alleleid:
create index alleleididx on allele (alleleid);

After an "analyze genotype" and an "analyze allele" the query should perform
much better. The explain analyze should show you an index scan instead of the
seq scan after that.
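
To check the result, something along these lines may help. It is only a
sketch: it assumes the three indexes above were created under the names given,
and it leaves the original query elided, since it was not posted in full in
this thread. pg_indexes is a standard system view. Whether the planner
actually switches to an index scan depends on how many rows the query has to
touch; a query that reads every row of genotype may still get a seq scan.

```sql
-- Refresh planner statistics for both tables after creating the indexes.
ANALYZE genotype;
ANALYZE allele;

-- Confirm that the indexes exist on both tables.
SELECT tablename, indexname, indexdef
FROM pg_indexes
WHERE tablename IN ('genotype', 'allele')
ORDER BY tablename, indexname;

-- Then re-run the plan for the original query (elided, as in the thread)
-- and look for "Index Scan" nodes in place of "Seq Scan":
-- EXPLAIN SELECT ... ;
```

Plain EXPLAIN only shows the estimated plan and returns immediately, so it is
a cheap first check before running the much slower EXPLAIN ANALYZE.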

HTH

Uwe

-- 
Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx)