Re: Optimizing count(), but Explain estimates wildly off


> ->  Parallel Seq Scan on pricescope_items  (cost=0.00..1027794.01 rows=3277101 width=522) (actual time=0.753..41654.507 rows=2621681 loops=3)

> Why does it take over 41 seconds to read a table with less than
> 3 million rows?  Are the rows so large?  Is the table bloated?
> What is the size of the table as measured with pg_relation_size()
> and pg_table_size()?
There's one JSON column in each table with a couple of fields, and a column with long text values in Items.

-- pg_table_size, pg_relation_size, pg_indexes_size, rows
 name    | table_size    | relation_size | index_size | row_estimate
---------+---------------+---------------+------------+--------------
 tenders | 1,775,222,784 | 1,630,461,952 |            | 3,815,567
 items   | 8,158,773,248 | 6,052,470,784 |            | 7,865,043
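
For reference, the numbers above and below come from a query along these lines (the exact query I ran may have differed slightly):

  SELECT c.relname               AS name,
         pg_table_size(c.oid)    AS table_size,
         pg_relation_size(c.oid) AS relation_size,
         pg_indexes_size(c.oid)  AS index_size,
         c.reltuples::bigint     AS row_estimate
  FROM pg_class c
  WHERE c.relname IN ('tenders', 'items');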

check_postgres gave a bloat score of 1.4 to tenders and 1.9 to items. I also had a duplicate index on transaction_id (one hand-made, the other from the unique constraint), plus some text-column indexes with 0.3-0.5 bloat scores. After a VACUUM FULL ANALYZE, the sizes are greatly reduced, especially for Items:

-- pg_table_size, pg_relation_size, pg_indexes_size, rows
 name    | table_size    | relation_size | index_size    | row_estimate
---------+---------------+---------------+---------------+--------------
 tenders | 1,203,445,760 | 1,203,421,184 | 500,482,048   | 3,815,567
 items   | 4,436,189,184 | 4,430,790,656 | 2,326,118,400 | 7,865,043
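
The cleanup itself amounted to roughly this (the duplicate index name here is made up, just to show the shape of it):

  DROP INDEX IF EXISTS items_transaction_id_idx;  -- hand-made duplicate of the unique constraint's index
  VACUUM FULL ANALYZE tenders;
  VACUUM FULL ANALYZE items;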

There were a couple of mass deletions, which probably caused the bloat. Autovacuum is on its default settings, but I guess it doesn't take care of that. Still, performance seems about the same.
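
In case it helps anyone else, a quick way to check whether autovacuum has been keeping up with the dead tuples is something like:

  SELECT relname, n_dead_tup, last_vacuum, last_autovacuum
  FROM pg_stat_user_tables
  WHERE relname IN ('tenders', 'items');

(Autovacuum only marks the dead space as reusable; it doesn't return it to the OS, which is why the files only shrank after the VACUUM FULL.)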

The planner is now using an Index Scan for Colombia without the subselect hack, but the subselect version is still about 200 ms faster on average, so I might as well keep using it.
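
To be clear about what I mean by the subselect hack, the two forms I'm comparing are roughly these (the country column and value are just illustrative, not the exact query):

  -- plain constant: the planner now picks the Index Scan on its own
  SELECT count(*) FROM pricescope_items WHERE country = 'Colombia';

  -- subselect hack: hide the constant so the planner can't use its statistics for it
  SELECT count(*) FROM pricescope_items WHERE country = (SELECT 'Colombia');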

The row estimate is still off by more than 1M rows, so I still can't rely on it, but at least now it takes less than 10 s to get the exact count with all countries.
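
(By "row estimate" I mean the planner's figure, e.g. the one you can pull straight from pg_class:

  SELECT reltuples::bigint AS row_estimate
  FROM pg_class
  WHERE relname = 'pricescope_items';

which is still well over a million rows away from the real count.)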
