Hi Tom, hi Ludwig, Thanks for your support. Yes, this query has grown very big with time, and I was always asked to add exceptions in it, so the result is quite frightening! >>> TOM: If you try setting enable_bitmapscan off, >>> you'll probably find 8.1 beating 7.4 handily for this query. Correct. I had 239 seconds on the old 7.4 server, and I get 189 seconds with "enable_bitmapscan = off". >>> LUDWIG: What about adding an index to the field >>> etapes_lignes_commandes(code_etape) I have this index already. >>> LUDWIG: What about increasing the settings of work_mem? default work_mem = 1024 ==> 511 seconds work_mem = 2048 ==> 531 seconds >>> TOM: Am I right in guessing that your database is small >>> enough to fit into RAM on the new server? select pg_database_size('groupefpdb'); returns "360428168" That's 360 MB. It fits in RAM, correct! >>> TOM: If so, it would be reasonable to reduce random_page_cost, >>> perhaps all the way to 1.0, and this would probably improve >>> the quality of the planner's choices for you. With that change I get results in 193 seconds. >>> TOM: What might work better is to get rid of the indexes >>> w_code_type_workflow and lw_ordre --- do you have any >>> queries where those are actually useful? Yes, I think I do, but let's try: Drop both indexes VACUUM FREEZE ANALYZE ANALYZE I get the results in 199 seconds. >>> TOM: Another thing you should look at is increasing the >>> cpu-cost parameters. The numbers in your EXPLAIN ANALYZE >>> results suggest that on your new machine the cost of >>> processing an index tuple is about 1/50th of the cost of >>> touching an index page; that is, you ought to have >>> cpu_index_tuple_cost plus cpu_operator_cost around 0.02. >>> I'd try setting each of them to 0.01 and increasing >>> cpu_tuple_cost a little bit, maybe to 0.02. cpu_index_tuple_cost = 0.01 cpu_operator_cost = 0.01 cpu_tuple_cost = 0.02 With these change, plus random_page_cost = 1, I get results in 195 seconds. (Yes, I did restart the server!) The new EXPLAIN ANALYSE at this point is here: Postgresql 8.1.4 with tuning: http://www.attiksystem.ch/explain_analyze_81_2.txt The old EXPLAIN ANALYZE are still here: Postgresql 7.4.5: http://www.attiksystem.ch/explain_analyze_74.txt Postgresql 8.1.4 without tuning: http://www.attiksystem.ch/explain_analyze_81.txt Is there maybe something I could tune further on the kernel side? I get only 20 % improvements with the new server with Linux, compared to the workstation with freebsd... Maybe my query is so CPU-bound that the most important thing is the CPU clock speed, and in both cases I have a single 3Ghz CPU to run the query? What do you think? Philippe -----Message d'origine----- De : Tom Lane [mailto:tgl@xxxxxxxxxxxxx] Envoyé : mercredi, 31. mai 2006 18:21 À : Philippe Lang Cc : pgsql-general@xxxxxxxxxxxxxx Objet : Re: [GENERAL] PGSQL 7.4 -> 8.1 migration & performance problem "Philippe Lang" <philippe.lang@xxxxxxxxxxxxxx> writes: > Here are both EXPLAIN ANALYSE results, plus the query itself: > Postgresql 7.4.5: http://www.attiksystem.ch/explain_analyze_74.txt > Postgresql 8.1.4: http://www.attiksystem.ch/explain_analyze_81.txt > Query is here: http://www.attiksystem.ch/big_query.txt My goodness, that is one big ugly query :-( Trawling through the EXPLAIN ANALYZE results, it seems that the lossage comes from the repeated occurrences of this pattern: SELECT travaux_en_cours_mvu FROM lignes_workflows AS lw INNER JOIN workflows AS w ON lw.id_workflow = w.id WHERE w.code_article = lignes_commandes.code_article AND w.code_type_workflow = commandes.code_type_workflow AND SUBSTRING(lw.code_etape FROM 1 FOR 3) = SUBSTRING(etapes_lignes_commandes.code_etape FROM 1 FOR 3) AND lw.ordre = etapes_lignes_commandes.ordre 7.4 is doing this as -> Nested Loop (cost=0.00..37.28 rows=1 width=8) (actual time=0.056..0.087 rows=1 loops=13653) -> Index Scan using w_code_article on workflows w (cost=0.00..15.76 rows=1 width=4) (actual time=0.016..0.024 rows=1 loops=13653) Index Cond: (code_article = $1) Filter: (code_type_workflow = $2) -> Index Scan using lw_id_workflow on lignes_workflows lw (cost=0.00..21.51 rows=1 width=12) (actual time=0.023..0.036 rows=1 loops=13651) Index Cond: (lw.id_workflow = "outer".id) Filter: (("substring"((code_etape)::text, 1, 3) = "substring"(($3)::text, 1, 3)) AND (ordre = $4)) 8.1 is doing -> Nested Loop (cost=18.93..26.84 rows=1 width=8) (actual time=0.431..0.434 rows=1 loops=13630) -> Bitmap Heap Scan on workflows w (cost=6.63..10.51 rows=1 width=4) (actual time=0.107..0.107 rows=1 loops=13630) Recheck Cond: ((code_article = $1) AND (code_type_workflow = $2)) -> BitmapAnd (cost=6.63..6.63 rows=1 width=0) (actual time=0.104..0.104 rows=0 loops=13630) -> Bitmap Index Scan on w_code_article (cost=0.00..2.02 rows=5 width=0) (actual time=0.017..0.017 rows=5 loops=13630) Index Cond: (code_article = $1) -> Bitmap Index Scan on w_code_type_workflow (cost=0.00..4.36 rows=389 width=0) (actual time=0.083..0.083 rows=390 loops=13628) Index Cond: (code_type_workflow = $2) -> Bitmap Heap Scan on lignes_workflows lw (cost=12.30..16.31 rows=1 width=12) (actual time=0.312..0.313 rows=1 loops=13628) Recheck Cond: ((lw.id_workflow = "outer".id) AND (lw.ordre = $4)) Filter: ("substring"((code_etape)::text, 1, 3) = "substring"(($3)::text, 1, 3)) -> BitmapAnd (cost=12.30..12.30 rows=1 width=0) (actual time=0.306..0.306 rows=0 loops=13628) -> Bitmap Index Scan on lw_id_workflow (cost=0.00..2.02 rows=7 width=0) (actual time=0.009..0.009 rows=7 loops=13628) Index Cond: (lw.id_workflow = "outer".id) -> Bitmap Index Scan on lw_ordre (cost=0.00..10.03 rows=1437 width=0) (actual time=0.293..0.293 rows=1714 loops=13628) Index Cond: (ordre = $4) In the parts of the plan that do not depend on workflows/lignes_workflows joins, 8.1 is consistently beating 7.4. So basically 8.1 is being too optimistic about the value of ANDing multiple indexes. If you try setting enable_bitmapscan off, you'll probably find 8.1 beating 7.4 handily for this query. That's a really blunt-instrument solution of course, and I wouldn't recommend it for production because it'll probably kill performance elsewhere. What might work better is to get rid of the indexes w_code_type_workflow and lw_ordre --- do you have any queries where those are actually useful? Meanwhile, I think I'm going to have to take another look at the bitmap cost estimates ... it shouldn't be bothering to AND a 7-row result with a 1437-row result, even if that does save six trips to the heap. regards, tom lane
Attachment:
smime.p7s
Description: S/MIME cryptographic signature