On 7/15/09 4:56 PM, "Devin Ben-Hur" <dbenhur@xxxxxxxxxxxxxx> wrote:
> Marc Cousin wrote:
>> This mail contains the asked plans:
>> Plan 1
>> around 1 million records to insert, seq_page_cost 1, random_page_cost 4
>
>>   ->  Hash  (cost=425486.72..425486.72 rows=16746972 width=92) (actual time=23184.196..23184.196 rows=16732049 loops=1)
>>         ->  Seq Scan on path  (cost=0.00..425486.72 rows=16746972 width=92) (actual time=0.004..7318.850 rows=16732049 loops=1)
>
>>   ->  Hash  (cost=1436976.15..1436976.15 rows=79104615 width=35) (actual time=210831.840..210831.840 rows=79094418 loops=1)
>>         ->  Seq Scan on filename  (cost=0.00..1436976.15 rows=79104615 width=35) (actual time=46.324..148887.662 rows=79094418 loops=1)
>
> This doesn't address the cost-driving plan question, but I think it's a
> bit puzzling that a seq scan of 17M 92-byte rows completes in 7 secs,
> while a seq scan of 79M 35-byte rows takes 149 secs. That's about a 4:1
> row ratio and less than a 2:1 byte ratio, but a 20:1 time ratio. Perhaps
> there's some terrible bloat on filename that's not present on path. If
> the seq scan time on filename were proportionate to path's, this plan
> would complete about two minutes faster (making it only 6 times slower
> instead of 9 :).

Bloat is possible. It can be checked by running VACUUM VERBOSE on the
table. PostgreSQL also has a habit of fragmenting its table files on disk
under certain use patterns. Additionally, some of the table's pages may
have been cached in one run but not in the other.

> --
> -Devin
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
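
A minimal sketch of the bloat check suggested above, using the table names
from the plan (`path`, `filename`). VACUUM VERBOSE reports dead-row and
page counts; the catalog query gives a rough size sanity check to compare
against expected rows-times-width:

```sql
-- Report removable/dead tuples and page usage for the suspect table.
VACUUM VERBOSE filename;

-- Rough comparison: on-disk size vs. the planner's row estimates.
-- A table much larger than est_rows * avg row width suggests bloat.
SELECT relname,
       pg_size_pretty(pg_relation_size(oid)) AS on_disk,
       reltuples::bigint AS est_rows
FROM pg_class
WHERE relname IN ('path', 'filename');
```

If bloat is confirmed, CLUSTER or VACUUM FULL (with the usual locking
caveats) would rewrite the table compactly.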