05.08.11 11:44, Robert Ayrapetyan wrote:
> Yes, you are right. Performance becomes even more awful.
> Can some techniques from pg_bulkload be implemented in the PostgreSQL core?
> Current performance is not suitable for any enterprise-wide production system.
BTW: I was thinking this morning about indexes.
How about the following feature:
Implement a new index type that has two "zones" - old and new. The new
zone has a fixed, configurable size, say 100 pages (800 KB).
Any search probes both zones, so as soon as the index grows beyond
800 KB, every lookup has to be done twice.
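To make the idea easier to picture, here is a minimal toy sketch in
Python of the two-zone structure and the dual lookup. It is an
in-memory model only, not PostgreSQL code; the names (TwoZoneIndex,
new_keys, old_keys, new_zone_limit) are all invented for illustration.

import bisect

class TwoZoneIndex:
    def __init__(self, new_zone_limit=100_000):
        self.new_keys = []            # small "new" zone: recent inserts, hot in cache
        self.old_keys = []            # big "old" zone: everything merged so far
        self.new_zone_limit = new_zone_limit

    def insert(self, key):
        # Every insert goes into the small new zone, so its cost does not
        # depend on how big the old zone has grown.
        bisect.insort(self.new_keys, key)

    def needs_merge(self):
        # Once the new zone is over its limit, part of it should be
        # merged into the old zone (see the rolling merge below).
        return len(self.new_keys) > self.new_zone_limit

    def search(self, key):
        # The price of the scheme: once the index is larger than the new
        # zone, every lookup has to probe both zones.
        return self._contains(self.new_keys, key) or self._contains(self.old_keys, key)

    @staticmethod
    def _contains(keys, key):
        i = bisect.bisect_left(keys, key)
        return i < len(keys) and keys[i] == key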
As soon as the new zone hits its size limit, some of its pages (maybe
only one?) are merged into the old zone. The merge is "rolling" - if the
last merge stopped at entry X, the next merge starts at the entry right after X.
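A sketch of how that rolling merge could look, under the same toy
assumptions as above (plain Python lists standing in for index pages;
rolling_merge, last_merged_key and batch_size are made-up names):

import bisect

def rolling_merge(new_keys, old_keys, last_merged_key, batch_size):
    """Move up to batch_size entries from the sorted new zone into the
    old zone, starting right after the key where the previous merge
    stopped, and return the key where this merge stopped."""
    if not new_keys:
        return last_merged_key

    # Resume just past the previous stopping point; wrap around once we
    # run off the end of the new zone.
    start = 0 if last_merged_key is None else bisect.bisect_right(new_keys, last_merged_key)
    if start >= len(new_keys):
        start = 0

    batch = new_keys[start:start + batch_size]
    del new_keys[start:start + batch_size]

    # The batch is a contiguous key range, so its entries land next to
    # each other in the old zone: the writes are naturally grouped.
    for key in batch:
        bisect.insort(old_keys, key)

    return batch[-1]

For example, with new_keys = [5, 7, 9, 12], batch_size = 2 and no
previous merge, the first call merges 5 and 7 and returns 7; the next
call resumes at 9.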
As far as I can see, this should go a long way toward solving the
large-index insert problem:
1) Inserts into the new zone are quick because the zone is small and hot
in cache.
2) During a merge the writes are grouped, because items with nearby keys
(for a B-tree) or nearby hash values (for a hash index) land on a small
subset of the "old" zone's pages (see the sketch below). In the future
the merge could also be done by autovacuum in the background.
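To make point 2 concrete, a back-of-the-envelope example (the
200-keys-per-page figure is an arbitrary assumption): a merge batch of
50 neighbouring keys dirties a single old-zone page, while the same 50
keys inserted individually at scattered spots would dirty 50 pages.

KEYS_PER_PAGE = 200          # arbitrary assumption for the illustration

def pages_touched(keys):
    # Model an old-zone "page" as a contiguous range of 200 key values.
    return {k // KEYS_PER_PAGE for k in keys}

merge_batch = range(100_000, 100_050)         # 50 adjacent keys from one merge
scattered   = range(0, 10_000_000, 200_000)   # 50 keys spread over the whole index

print(len(pages_touched(merge_batch)))        # -> 1 page dirtied
print(len(pages_touched(scattered)))          # -> 50 pages dirtied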
Yes, we get a dual index search, but the new zone will be hot in cache,
so lookups won't become twice as costly.
Best regards, Vitalii Tymchyshyn