In this example it looks to me like the planner is choosing a Seq Scan resulting in 18x running time compared to running it with enable_seqscan = 'off'. Adding more indexes to public.gene (please see below) seemed to make things worse. I definitely have run VACUUM ANALYZE on everything, manually. What am I missing? Thank you for any feedback. Query: SELECT * FROM gene_af_polyphen WHERE dataset_id = '001-1' AND (vartype = 'snp' OR vartype = 'ins' OR vartype = 'del' OR vartype = 'sub'); Query plan: http://explain.depesz.com/s/qnZ Query plan after SET enable_seqscan TO 'off': http://explain.depesz.com/s/N5q Hardware: 24GB memory / 8 core running Linux 2.6.32 x86_64 Database configuration: version | PostgreSQL 9.0.4 on x86_64-unknown-linux-gnu, compiled by GCC gcc (Ubuntu 4.4.3-4ubuntu5) 4.4.3, 64-bit autovacuum | off default_transaction_isolation | serializable effective_cache_size | 18GB lc_collate | en_US.UTF-8 lc_ctype | en_US.UTF-8 listen_addresses | * max_connections | 100 max_stack_depth | 2MB server_encoding | UTF8 shared_buffers | 6GB TimeZone | US/Eastern work_mem | 1GB Table row counts: public.gene ~1 billion (may have lots of NULLs in several columns) public.af 38878319 public.polyphen 25821 Database: Table "public.gene" Column | Type | Modifiers -----------------------+------------------------+----------- dataset_id | character varying(255) | referencename | character varying(255) | index | integer | locus | integer | haplotype | integer | chromosome | character varying(255) | begincoord | integer | endcoord | integer | vartype | character varying(255) | reference | character varying(255) | call | character varying(255) | xref | text | geneid | integer | mrnaacc | character varying(255) | proteinacc | character varying(255) | symbol | character varying(255) | orientation | character(1) | exoncategory | character varying(255) | exon | integer | codingregionknown | character(1) | aacategory | character varying(255) | nucleotidepos | character varying(255) | proteinpos | character varying(255) | aaannot | character varying(255) | aacall | character varying(255) | aaref | character varying(255) | allele | character varying(255) | component | character varying(255) | componentindex | character varying(255) | impact | character varying(255) | annotationrefsequence | character varying(255) | samplesequence | character varying(255) | genomerefsequence | character varying(255) | pfam | character varying(255) | unknown1 | character varying(255) | Indexes: "gene_dataset_id_idx" btree (dataset_id), tablespace "indexspace" Table "public.af" Column | Type | Modifiers -------------+------------------------+----------- chromosome | character varying(255) | not null endcoord | integer | not null rs_id | character varying(255) | reference | character varying(255) | not null call | character varying(255) | not null allele_freq | numeric | Indexes: "af_allele_freq_idx" btree (allele_freq), tablespace "indexspace" "af_call_idx" btree (call), tablespace "indexspace" "af_chromosome_idx" btree (chromosome), tablespace "indexspace" "af_endcoord_idx" btree (endcoord), tablespace "indexspace" "af_reference_idx" btree (reference), tablespace "indexspace" Table "public.polyphen" Column | Type | Modifiers -------------------------+------------------------+----------- mrnaacc | character varying(255) | not null proteinpos | character varying(255) | not null annotationrefsequence | character varying(255) | not null samplesequence | character varying(255) | not null prediction | character varying(255) | probability_deleterious | numeric | Indexes: "polyphen_annotationrefsequence_idx1" btree (annotationrefsequence), tablespace "indexspace" "polyphen_mrnaacc_idx1" btree (mrnaacc), tablespace "indexspace" "polyphen_proteinpos_idx1" btree (proteinpos), tablespace "indexspace" "polyphen_samplesequence_idx1" btree (samplesequence), tablespace "indexspace" CREATE VIEW gene_af_polyphen AS SELECT gene.dataset_id dataset_id, gene.referencename referencename, gene.index "index", gene.locus locus, gene.haplotype haplotype, gene.chromosome chromosome, gene.begincoord begincoord, gene.endcoord endcoord, gene.vartype vartype, gene.reference reference, gene.call call, gene.xref xref, gene.geneid geneid, gene.mrnaacc mrnaacc, gene.proteinacc proteinacc, gene.symbol symbol, gene.orientation orientation, gene.exoncategory exoncategory, gene.exon exon, gene.codingregionknown codingregionknown, gene.aacategory aacategory, gene.nucleotidepos nucleotidepos, gene.proteinpos proteinpos, gene.aaannot aaannot, gene.aacall aacall, gene.aaref aaref, gene.allele allele, gene.component component, gene.componentindex componentindex, gene.impact impact, gene.annotationrefsequence annotationrefsequence, gene.samplesequence samplesequence, gene.genomerefsequence genomerefsequence, gene.pfam pfam, gene.unknown1 unknown1, af.rs_id rs_id, af.allele_freq allele_freq, polyphen.prediction prediction, polyphen.probability_deleterious probability_deleterious FROM gene LEFT JOIN af ON gene.chromosome = af.chromosome AND gene.endcoord = af.endcoord AND gene.reference = af.reference AND gene.call = af.call LEFT JOIN polyphen ON gene.mrnaacc = polyphen.mrnaacc AND gene.proteinpos = polyphen.proteinpos AND gene.annotationrefsequence = polyphen.annotationrefsequence AND gene.samplesequence = polyphen.samplesequence; -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance