> -> Parallel Seq Scan on pricescope_items (cost=0.00..1027794.01 rows=3277101 width=522) (actual time=0.753..41654.507 rows=2621681 loops=3)
Why does it take over 41 seconds to read a table with less than
3 million rows? Are the rows that large? Is the table bloated?
What is the size of the table as measured with pg_relation_size()
and pg_table_size()?
There's one JSON column with a couple of fields in each table, and a column with long text values in items.
name | table_size (bytes) | relation_size (bytes) | index_size (bytes) | row_estimate
---|---|---|---|---
tenders | 1,775,222,784 | 1,630,461,952 | | 3,815,567
items | 8,158,773,248 | 6,052,470,784 | | 7,865,043
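A query along these lines produces those numbers (the `public` schema is an assumption, and the plan above suggests the real relation names may carry a `pricescope_` prefix):

```sql
SELECT c.relname               AS name,
       pg_table_size(c.oid)    AS table_size,     -- table + TOAST
       pg_relation_size(c.oid) AS relation_size,  -- main fork only
       pg_indexes_size(c.oid)  AS index_size,
       c.reltuples::bigint     AS row_estimate    -- planner's estimate
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname = 'public'                        -- assumed schema
  AND c.relname IN ('tenders', 'items');
```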
check_postgres gave a bloat score of 1.4 to tenders and 1.9 to items. I had a duplicate index on transaction_id (one made by hand, the other from the unique constraint), plus some text-column indexes with bloat scores of 0.3-0.5. After VACUUM FULL ANALYZE, sizes are greatly reduced, especially for items:
name | table_size (bytes) | relation_size (bytes) | index_size (bytes) | row_estimate
---|---|---|---|---
tenders | 1,203,445,760 | 1,203,421,184 | 500,482,048 | 3,815,567
items | 4,436,189,184 | 4,430,790,656 | 2,326,118,400 | 7,865,043
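For reference, the cleanup described above amounted to something like this (the index name is hypothetical; note that VACUUM FULL rewrites the table under an ACCESS EXCLUSIVE lock, so it blocks reads and writes while it runs):

```sql
-- drop the redundant hand-made index; the one backing the unique
-- constraint on transaction_id remains
DROP INDEX IF EXISTS items_transaction_id_idx;  -- hypothetical index name

-- rewrite the tables to return the space freed by the mass deletions
VACUUM FULL ANALYZE tenders;
VACUUM FULL ANALYZE items;
```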
There were a couple of mass deletions, which probably caused the bloat.
Autovacuum is on its defaults, but I guess it doesn't take care of reclaiming that space. Still, performance seems about the same.
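(Plain autovacuum only marks dead tuples as reusable; it doesn't shrink the files back, which is why a rewrite like VACUUM FULL was needed after the mass deletions. If deletions on that scale keep happening, one option is to make autovacuum trigger earlier on these tables; the threshold below is illustrative, not something from my setup:)

```sql
-- run autovacuum after ~1% of the table changes instead of the default 20%
ALTER TABLE items   SET (autovacuum_vacuum_scale_factor = 0.01);
ALTER TABLE tenders SET (autovacuum_vacuum_scale_factor = 0.01);
```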
The planner now uses an Index Scan for Colombia without the subselect hack, but the subselect is still ~200 ms faster on average, so I might as well keep using it.
The row estimate is still off by 1M+, so I still can't use it, but at least it now takes less than 10 s to get the exact count with all countries.
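For comparison, the two counting approaches are roughly the following (reading the estimate from `pg_class.reltuples` is just one way to get it; an EXPLAIN-based estimate behaves similarly):

```sql
-- estimated count: instant, but still off by over a million rows here
SELECT reltuples::bigint AS row_estimate
FROM pg_class
WHERE relname = 'pricescope_items';

-- exact count across all countries: now under ~10 s after the VACUUM FULL
SELECT count(*) FROM pricescope_items;
```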