Hello everybody,
I'm doing some tests with a large table of about 7 million tuples.
I need to retrieve only the last value for a given key. That key has
about 20,000 tuples in this table.
SELECT field1
FROM table_7milions
WHERE field1 = 'my_key'
ORDER BY field1 DESC
LIMIT 1
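(As a side note, something like the following should show the buffer reads
for this exact query, assuming a PostgreSQL version that supports the
BUFFERS option of EXPLAIN; this is only a sketch of how the numbers below
can be checked per query:)

EXPLAIN (ANALYZE, BUFFERS)
SELECT field1
FROM table_7milions
WHERE field1 = 'my_key'
ORDER BY field1 DESC
LIMIT 1;
-- the plan output includes "Buffers: shared hit=... read=..." for each node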
The statistics tables show that Postgres read about 782656 blocks from
disk for the table and another 24704 blocks from disk for the index.
This simple select reads about 98 MB from disk and loads it into shared
memory.
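(Those per-table and per-index counters can be read from the statistics
views; a query along these lines against pg_statio_user_tables should
show them, using the standard column names of that view:)

SELECT heap_blks_read, heap_blks_hit,
       idx_blks_read, idx_blks_hit
FROM pg_statio_user_tables
WHERE relname = 'table_7milions';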
So I did some tests like this:
-- I have created a partial table for that key
SELECT *
INTO part_table
FROM table_7milions
WHERE field1 = 'my_key'
-- Now I run the same select on the same 20,000 tuples, but in part_table
SELECT field1
FROM part_table
WHERE field1 = 'my_key'
ORDER BY field1 DESC
LIMIT 1
Now the statistics show that Postgres read 54016 blocks from disk, all
for the table, because it doesn't have an index.
The same select reads about 6 MB from disk and loads it into shared
memory.
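(For comparison, the on-disk sizes of the two tables can be checked with
something like this, assuming a version that has pg_relation_size and
pg_size_pretty; the column aliases are just labels:)

SELECT pg_size_pretty(pg_relation_size('table_7milions')) AS full_table,
       pg_size_pretty(pg_relation_size('part_table')) AS partial_table;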
I think this happens because in the 7 million row table the same 8k
block holds records with many different keys, so only a few records
with 'my_key' are retrieved each time an 8k block is read.
In part_table, every record stored in an 8k block has 'my_key', so the
reads are much more efficient.
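(A rough way to verify that is to count how many distinct heap blocks
actually hold rows for 'my_key'; the cast through point is just a
convenient trick to extract the block number from ctid, and the alias
is only a label:)

SELECT count(DISTINCT (ctid::text::point)[0]) AS blocks_with_my_key
FROM table_7milions
WHERE field1 = 'my_key';
-- running the same count against part_table should give a much smaller number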
My question: is there a way to defragment my 7 million row table so
that all records with the same key end up in the same 8k blocks?
How can I do that?
If there isn't, I think it would be a good feature for future versions.
Thank you,
Waldomiro Caraiani