2008/1/10, Tom Lane <tgl@xxxxxxxxxxxxx>: > "Scott Marlowe" <scott.marlowe@xxxxxxxxx> writes: > > OK, I was looking at the previous thread that you thought had > > disappeared, and with the explain analyze output from 8.3 I noticed > > something odd. > > > For 8.2 you had something like this: > > > QUERY PLAN > > ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ > > Subquery Scan "*SELECT*" (cost=326089.49..350310.28 rows=880756 > > width=20) (actual time=11444.566..13114.365 rows=880691 loops=1) > > -> HashAggregate (cost=326089.49..339300.83 rows=880756 width=12) > > (actual time=11444.554..12438.188 rows=880691 loops=1) > > ... SNIP ... > > Time: 357750.531 ms > > > And for 8.3 you had something like this: > > QUERY PLAN > > ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ > > Subquery Scan "*SELECT*" (cost=316145.48..340289.33 rows=877958 > > width=20) (actual time=10650.036..12997.377 rows=877895 loops=1) > > -> HashAggregate (cost=316145.48..329314.85 rows=877958 width=12) > > (actual time=10650.023..12193.890 rows=877895 loops=1) > > ... SNIP ... > > Time: 9547801.116 ms > > > In both of those instances, the actual time used is WAY larger than > > the time listed in the explain analyze, which has usually pointed to a > > very expensive OS level get time of day call. > > No, that's not it --- these last two runs are on the same OS. > > Remember this command is an INSERT/SELECT. The total actual time for > the SELECT part of it (ie, forming the rows to insert) is what's > reported as the top-level plan node's runtime --- so 8.3 is actually a > shade faster than 8.2 here. And separately from that we can see the > time spent in the one foreign-key trigger, which seems to be slower on > 8.3, but neither of these are where the majority of the runtime is > going. The rest of the runtime must be spent in the physical data > insertion --- storing the rows, updating the indexes, and/or writing > WAL. I don't see any particular reason for that to be so much worse in > 8.3; especially not for such a simple table definition: > http://archives.postgresql.org/pgsql-general/2008-01/msg00266.php > > There's something awfully weird going on there. I can replicate some > slowdown from 8.2 to 8.3, which should be looked into, but it's not the > order-of-magnitude-worse that Clodaldo is getting. I tried this simple > experiment, which should be at least comparable to his table structure: > > create table foo (f1 int, f2 int, f3 int, f4 real); > create index fooi on foo(f1); > create index fooi2 on foo(f2); > explain analyze insert into foo select i,i,0,1.0 from generate_series(1,1000000) i; > > and repeated the last a few times. On 8.2.6 I get numbers like this: > > regression=# explain analyze insert into foo select i,i,0,1.0 from generate_series(1,1000000) i; > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------------- > Function Scan on generate_series i (cost=0.00..12.50 rows=1000 width=4) (actual time=419.101..1408.868 rows=1000000 loops=1) > Total runtime: 16783.096 ms > (2 rows) > > regression=# explain analyze insert into foo select i,i,0,1.0 from generate_series(1,1000000) i; > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------------- > Function Scan on generate_series i (cost=0.00..12.50 rows=1000 width=4) (actual time=421.440..1512.787 rows=1000000 loops=1) > Total runtime: 18785.652 ms > (2 rows) > > 8.3RC1 gives numbers more like this: > > regression=# explain analyze insert into foo select i,i,0,1.0 from generate_series(1,1000000) i; > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------------- > Function Scan on generate_series i (cost=0.00..12.50 rows=1000 width=4) (actual time=420.742..1872.290 rows=1000000 loops=1) > Total runtime: 17278.170 ms > (2 rows) > > regression=# explain analyze insert into foo select i,i,0,1.0 from generate_series(1,1000000) i; > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------------- > Function Scan on generate_series i (cost=0.00..12.50 rows=1000 width=4) (actual time=435.392..1909.631 rows=1000000 loops=1) > Total runtime: 20673.453 ms > (2 rows) > > This is my usual debugging setup, built --enable-debug --enable-cassert > but with "SET debug_assertions = 0" so that shouldn't affect things too much, > no configuration changes from what initdb gave me except fsync = off > (so it's not too representative of real-world performance, maybe). > > It would be interesting to see the identical test on Clodaldo's > installations. This is 8.2.6 in the new server: cpn=> create table foo (f1 int, f2 int, f3 int, f4 real); CREATE TABLE cpn=> create index fooi on foo(f1); CREATE INDEX cpn=> create index fooi2 on foo(f2); CREATE INDEX cpn=> explain analyze insert into foo select i,i,0,1.0 from generate_series(1,1000000) i; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------ Function Scan on generate_series i (cost=0.00..12.50 rows=1000 width=4) (actual time=270.425..699.067 rows=1000000 loops=1) Total runtime: 12888.913 ms (2 rows) The table into which I'm inserting 800 thousand rows, usuarios, has 135 million rows so I did: cpn=> explain analyze insert into foo select i,i,0,1.0 from generate_series(1,135500000) i; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------- Function Scan on generate_series i (cost=0.00..12.50 rows=1000 width=4) (actual time=49852.161..403976.519 rows=135500000 loops=1) Total runtime: 2044745.294 ms (2 rows) cpn=> analyze; ...warnings... ANALYZE cpn=> explain analyze insert into foo select i,i,0,1.0 from generate_series(135500001, 135500000 + 800000 ) i; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------- Function Scan on generate_series i (cost=0.00..12.50 rows=1000 width=4) (actual time=196.804..553.617 rows=800000 loops=1) Total runtime: 11202.895 ms (2 rows) The real table has foreign keys. Don't it make a difference? I can reinstall 8.3 and try it. Will it be necessary or interesting? beta4 or RC1? Regards, Clodoaldo Pinto Neto ---------------------------(end of broadcast)--------------------------- TIP 2: Don't 'kill -9' the postmaster