Search Postgresql Archives

query plan in pg7.4 vs 8.0.3

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I am in process of migrating a database from Pg 7.4.3 to Pg 8.0.3. I dumped and reloaded the data, and things look good, until I try to work with it. When I try to make a query to a view based on a large join (select * from 'view' limit 10), the 7.4.3 query works very well, and the indexes are being used well. But, when I go to 8.0.3, there are some very costly joins, and the query is not workable. (I've put the 'EXPLAIN' results below for the same database, and the same query.) Also, the memory settings have been made the same for 7.4 vs. 8.0.

I should point out that some of the indexes (and joins) are based on the type 'citext'. (http://gborg.postgresql.org/project/citext/ projdisplay.php). I did compile and add this to my Pg 8 installation, but I am concerned that this might be the root problem. There seems to be no update on this work since sometime in 2003, and it might not be appropriate to use in 8.0.3. But, it seems that indexes are being used when I do searches on the individual tables. So, I'm a bit stumped at the moment.

So, I'd like to ask advice on my query, and what might be the cause. If 'citext' is the culprit, how can I migrate a very large database which uses this datatype extensively.

Any help is greatly appreciated.

Thanks,
-albert

===
In Pg 8.0.3--

EXPLAIN select * from snp_quality_overview limit 10;
QUERY PLAN ------------------------------------------------------------------------ ------------------------------------------------------------------------ -----------------
Limit  (cost=83734.45..84167.07 rows=10 width=415)
   ->  Hash Join  (cost=83734.45..10949200.55 rows=251154 width=415)
         Hash Cond: ("outer".panel_id = "inner".panel_id)
-> Nested Loop (cost=83733.39..10924084.09 rows=251154 width=359) -> Nested Loop (cost=83733.39..5685331.85 rows=253240 width=324) -> Hash Join (cost=83733.39..3840930.37 rows=423746 width=291) Hash Cond: ("outer".assay_id = "inner".assay_id) -> Nested Loop (cost=862.47..3680521.72 rows=602829 width=259) -> Hash Join (cost=862.47..258437.41 rows=602829 width=195) Hash Cond: ("outer".genotype_submission_id = "inner".genotype_submission_id) -> Seq Scan on genotype_set gs (cost=0.00..218413.11 rows=6626211 width=117) -> Hash (cost=861.68..861.68 rows=316 width=94) -> Nested Loop (cost=450.77..861.68 rows=316 width=94) Join Filter: ("inner".lsid = "outer".prot_lsid) -> Merge Join (cost=446.02..671.68 rows=358 width=83) Merge Cond: ("outer".batch_file_id = "inner".batch_file_id) -> Index Scan using batch_file_id3 on genotype_submission gsb (cost=0.00..210.81 rows=3472 width=75) -> Sort (cost=446.02..447.31 rows=517 width=20) Sort Key: batch_file.batch_file_id -> Merge Join (cost=23.66..422.72 rows=517 width=20) Merge Cond: ("outer".monthly_batch_id = "inner".monthly_batch_id) Join Filter: ("outer".center_name = "inner".center_name) -> Index Scan using monthly_batch_id on batch_file (cost=0.00..298.59 rows=5024 width=24) -> Sort (cost=23.66..24.02 rows=145 width=19) Sort Key: monthly_batch.monthly_batch_id -> Seq Scan on monthly_batch (cost=0.00..18.45 rows=145 width=19) -> Materialize (cost=4.74..4.97 rows=23 width=72) -> Hash Join (cost=1.10..4.72 rows=23 width=72) Hash Cond: ("outer".genotyping_platform_id = "inner".genotyping_platform_id) -> Seq Scan on protocol p (cost=0.00..3.26 rows=26 width=65) -> Hash (cost=1.08..1.08 rows=8 width=15) -> Seq Scan on genotyping_platform gp (cost=0.00..1.08 rows=8 width=15) -> Index Scan using lsid on assay a (cost=0.00..5.66 rows=1 width=114) Index Cond: (a.lsid = "outer".assay_lsid) -> Hash (cost=56880.53..56880.53 rows=2118953 width=48) -> Seq Scan on assay2rs (cost=0.00..56880.53 rows=2118953 width=48) -> Index Scan using qcresult_genotype_set_id on qcanalysis_result_plate qc_pl (cost=0.00..4.34 rows=1 width=33) Index Cond: (qc_pl.genotype_set_id = "outer".genotype_set_id) -> Index Scan using qcanalysis_result_panel_genotype_set_id_key on qcanalysis_result_panel qc_pn (cost=0.00..20.61 rows=6 width=51) Index Cond: (qc_pn.genotype_set_id = "outer".genotype_set_id)
         ->  Hash  (cost=1.05..1.05 rows=5 width=64)
               ->  Seq Scan on panel  (cost=0.00..1.05 rows=5 width=64)
(42 rows)


====

In Pg 7.4.3--

hapmap_tracking=# explain  select * from snp_quality_overview limit 10;
QUERY PLAN ------------------------------------------------------------------------ ------------------------------------------------------------------------ ------------------------------
Limit  (cost=1.06..303.34 rows=10 width=413)
   ->  Hash Join  (cost=1.06..10120528.33 rows=334810 width=413)
         Hash Cond: ("outer".panel_id = "inner".panel_id)
-> Nested Loop (cost=0.00..10086209.24 rows=334810 width=357) -> Nested Loop (cost=0.00..6685494.20 rows=296445 width=322) -> Nested Loop (cost=0.00..5118344.12 rows=438783 width=289) -> Nested Loop (cost=0.00..2719551.73 rows=625532 width=258) -> Nested Loop (cost=0.00..216951.56 rows=625532 width=194) -> Nested Loop (cost=0.00..59958.71 rows=312 width=94) Join Filter: ("outer".genotyping_platform_id = "inner".genotyping_platform_id) -> Nested Loop (cost=0.00..59549.25 rows=347 width=87) -> Nested Loop (cost=0.00..58308.55 rows=347 width=83) Join Filter: (("inner".monthly_batch_id = "outer".monthly_batch_id) AND ("inner".center_name = "outer".center_name)) -> Seq Scan on monthly_batch (cost=0.00..21.38 rows=138 width=19) -> Merge Join (cost=0.00..372.99 rows=3292 width=87) Merge Cond: ("outer".batch_file_id = "inner".batch_file_id) -> Index Scan using batch_file_id3 on genotype_submission gsb (cost=0.00..175.60 rows=3292 width=75) -> Index Scan using batch_file_pkey on batch_file (cost=0.00..144.36 rows=4768 width=24) -> Index Scan using lsid5 on protocol p (cost=0.00..3.56 rows=1 width=64) Index Cond: (p.lsid = "outer".prot_lsid) -> Seq Scan on genotyping_platform gp (cost=0.00..1.08 rows=8 width=15) -> Index Scan using genotype_submission_id on genotype_set gs (cost=0.00..464.24 rows=3115 width=116) Index Cond: ("outer".genotype_submission_id = gs.genotype_submission_id) -> Index Scan using lsid on assay a (cost=0.00..3.99 rows=1 width=114) Index Cond: (a.lsid = "outer".assay_lsid) -> Index Scan using assay2rs_assay_iduk on assay2rs (cost=0.00..3.82 rows=1 width=47) Index Cond: (assay2rs.assay_id = "outer".assay_id) -> Index Scan using qcresult_genotype_set_id on qcanalysis_result_plate qc_pl (cost=0.00..3.56 rows=1 width=33) Index Cond: (qc_pl.genotype_set_id = "outer".genotype_set_id) -> Index Scan using qcanalysis_result_panel_genotype_set_id_key on qcanalysis_result_panel qc_pn (cost=0.00..11.41 rows=5 width=51) Index Cond: (qc_pn.genotype_set_id = "outer".genotype_set_id)
         ->  Hash  (cost=1.05..1.05 rows=5 width=64)
               ->  Seq Scan on panel  (cost=0.00..1.05 rows=5 width=64)
(33 rows)





---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux