2008/1/4, Tom Lane <tgl@xxxxxxxxxxxxx>: > Clodoaldo <clodoaldo.pinto.neto@xxxxxxxxx> writes: > > The same insert query takes 20 minutes in the production system and 2 > > hours in the new one. > > Hmph. It's the same plan, so it's not a planner mistake. Could you > post EXPLAIN ANALYZE rather than just EXPLAIN for both cases? New server's explain analyze now with debug_assertions = false as suggested by Greg: fahstats=> explain analyze fahstats-> insert into usuarios ( fahstats(> data, fahstats(> usuario, fahstats(> pontos, fahstats(> wus fahstats(> ) fahstats-> select fahstats-> (select data_serial from data_serial) as data, fahstats-> ui.usuario_serial as usuario, fahstats-> sum(pontos) as pontos, fahstats-> sum(wus) as wus fahstats-> from usuarios_temp as ut inner join usuarios_indice as ui fahstats-> on ut.usuario = ui.usuario_nome and ut.n_time = ui.n_time fahstats-> group by data, ui.usuario_serial fahstats-> ; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Subquery Scan "*SELECT*" (cost=323715.27..347866.73 rows=878235 width=20) (actual time=11072.973..13236.655 rows=878022 loops=1) -> HashAggregate (cost=323715.27..336888.79 rows=878235 width=12) (actual time=11072.960..12494.611 rows=878022 loops=1) InitPlan -> Seq Scan on data_serial (cost=0.00..1.01 rows=1 width=4) (actual time=0.008..0.009 rows=1 loops=1) -> Merge Join (cost=101826.80..263869.07 rows=5984519 width=12) (actual time=4515.658..10039.560 rows=883856 loops=1) Merge Cond: ((ut.n_time = ui.n_time) AND ((ut.usuario)::text = (ui.usuario_nome)::text)) -> Index Scan using usuarios_temp_ndx on usuarios_temp ut (cost=0.00..52895.92 rows=883856 width=23) (actual time=0.097..2188.894 rows=883856 loops=1) -> Sort (cost=101826.80..104022.38 rows=878235 width=19) (actual time=4515.538..4846.120 rows=883896 loops=1) Sort Key: ui.n_time, ui.usuario_nome Sort Method: quicksort Memory: 90130kB -> Seq Scan on usuarios_indice ui (cost=0.00..15126.35 rows=878235 width=19) (actual time=0.022..300.928 rows=878062 loops=1) Trigger for constraint datas: time=21853.261 calls=878022 Total runtime: 6889463.591 ms (13 rows) Time: 6889813.327 ms Same explain analyze in the production server: Subquery Scan "*SELECT*" (cost=317921.51..342084.63 rows=878659 width=20) (actual time=32184.729..37067.875 rows=878645 loops=1) -> HashAggregate (cost=317921.51..331101.40 rows=878659 width=12) (actual time=32184.714..34808.398 rows=878645 loops=1) InitPlan -> Seq Scan on data_serial (cost=0.00..1.01 rows=1 width=4) (actual time=0.015..0.017 rows=1 loops=1) -> Merge Join (cost=102030.95..261051.62 rows=5686888 width=12) (actual time=12060.037..29542.750 rows=884482 loops=1) Merge Cond: ((ut.n_time = ui.n_time) AND ((ut.usuario)::text = "inner"."?column4?")) -> Index Scan using usuarios_temp_ndx on usuarios_temp ut (cost=0.00..55089.16 rows=884482 width=26) (actual time=0.196..8751.167 rows=884482 loops=1) -> Sort (cost=102030.95..104227.60 rows=878659 width=22) (actual time=12059.789..13205.930 rows=884522 loops=1) Sort Key: ui.n_time, (ui.usuario_nome)::text -> Seq Scan on usuarios_indice ui (cost=0.00..15285.59 rows=878659 width=22) (actual time=0.051..1375.554 rows=878685 loops=1) Trigger for constraint datas: time=34157.875 calls=878645 Total runtime: 808956.038 ms (12 rows) Time: 809219.070 ms > Also, it'd be worth watching "vmstat 1" output while the queries are > running on each system, to see if there's any obvious discrepancy > in behavior at that level. # vmstat 1 in new system while running the insert query: procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------ r b swpd free buff cache si so bi bo in cs us sy id wa st 0 8 64 25320 58176 3342492 0 0 156 355 27 88 1 0 88 11 0 0 8 64 25344 58176 3342492 0 0 0 0 168 212 0 1 0 99 0 0 8 64 25344 58176 3342492 0 0 0 0 143 185 0 1 0 99 0 0 8 64 25468 58176 3342492 0 0 0 0 166 189 0 1 0 99 0 0 8 64 25568 58176 3342492 0 0 0 0 159 210 0 1 0 99 0 0 8 64 25708 58176 3342492 0 0 0 0 151 185 0 1 0 99 0 Some minutes later in the same system while still running the insert query: # vmstat 1 procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------ r b swpd free buff cache si so bi bo in cs us sy id wa st 0 5 64 25060 58352 3348704 0 0 156 356 27 88 1 0 87 11 0 0 5 64 24988 58352 3348732 0 0 48 0 113 156 1 1 50 49 0 0 5 64 24920 58352 3348808 0 0 48 0 116 150 0 0 50 50 0 0 6 64 24920 58352 3348884 0 0 64 0 146 163 0 0 41 58 0 0 6 64 24860 58352 3348964 0 0 56 0 136 204 0 0 0 100 0 0 6 64 24800 58352 3349044 0 0 56 0 128 194 0 0 0 99 0 About 40 minutes later still running the query: # vmstat 1 # new system procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------ r b swpd free buff cache si so bi bo in cs us sy id wa st 0 4 64 25700 60492 3334264 0 0 155 358 28 88 1 0 87 12 0 0 4 64 25672 60492 3334292 0 0 80 0 167 160 0 0 0 100 0 0 4 64 25672 60492 3334372 0 0 72 0 157 157 0 0 0 99 0 0 2 64 25544 60492 3334448 0 0 72 0 175 170 0 0 0 100 0 0 1 64 25224 60500 3334732 0 0 248 100 143 278 0 0 11 88 0 0 1 64 25024 60500 3334900 0 0 128 8 23 168 0 0 50 50 0 0 3 64 24956 60500 3335028 0 0 124 3336 264 168 0 0 46 53 0 0 2 64 24896 60500 3335112 0 0 80 0 180 168 0 0 50 50 0 0 1 64 24672 60508 3335360 0 0 168 48 35 220 0 0 50 49 0 0 1 64 24332 60508 3335772 0 0 452 104 76 400 0 0 50 49 0 0 3 64 25132 60508 3334556 0 0 652 7992 255 534 0 1 36 63 0 0 3 64 24996 60508 3334688 0 0 64 0 154 126 0 0 0 100 0 # vmstat 1 -p /dev/sdb3 # new system - don't know it this has any relevance. sdb3 reads read sectors writes requested writes 1643755 81073154 23523967 188191736 1643763 81073282 23524094 188192752 1643783 81073602 23524106 188192848 1643790 81073714 23524106 188192848 1643882 81075178 23524144 188193152 1643933 81075986 23524908 188199264 1643943 81076138 23525230 188201840 Production system while running the insert query: # vmstat 1 procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------ r b swpd free buff cache si so bi bo in cs us sy id wa st 1 0 1116 76224 16196 1704500 0 0 228 869 8 11 12 2 72 14 0 1 0 1116 71264 16196 1704500 0 0 0 0 1009 32 59 41 0 0 0 1 0 1116 66304 16196 1704500 0 0 0 0 1009 23 56 44 0 0 0 1 0 1116 60984 16196 1704500 0 0 0 0 1009 27 59 41 0 0 0 1 0 1116 55904 16204 1704492 0 0 0 12 1011 31 60 40 0 0 0 1 0 1116 50424 16204 1704500 0 0 0 0 1009 29 66 34 0 0 0 1 0 1116 50848 16204 1699168 0 0 0 8 1015 37 84 16 0 0 0 A few minutes later: # vmstat 1 # production system procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------ r b swpd free buff cache si so bi bo in cs us sy id wa st 2 1 1108 50832 10452 1670888 0 0 229 869 9 11 12 2 72 14 0 2 3 1108 52232 10448 1668680 0 0 5820 36556 1478 1094 59 30 0 11 0 2 7 1108 50908 10460 1670172 0 0 4476 8416 1417 1126 10 8 0 82 0 0 8 1108 52420 10392 1668368 0 0 6460 10568 1509 1465 8 8 0 84 0 0 2 1108 50528 10400 1671140 0 0 4304 10188 1355 973 5 8 0 87 0 0 2 1108 51848 10416 1669616 0 0 552 32 1153 399 0 2 0 98 0 I forgot to say that the raid 1 is software raid. Regards, Clodoaldo Pinto Neto ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster