On 16/07/12 18:56, Jon Nelson wrote:
> My explanation would apply to many databases, not just Postgres. It is not
> my intent to insult or even disparage my favorite software, but it took less
> time to *build* the indices for 550GB of data than it would have to insert
> 1/20th as much. That doesn't seem right.

To speed up the insert there are a number of possible approaches:

1. Partition the data, and then sort the temporary table into groups based on the partitioning. Best of all is if all the new data goes into a single partition (rough sketch at the end of this mail).

2. Drop the indexes before the insert and rebuild them afterwards.

3. Reduce the number of indexes. If you only have one index, you can sort the data to be inserted in the natural order of the index. If you must have more than one index, you could still sort the new data in the order of one of them to obtain a modest improvement in locality. (The second sketch at the end covers 2 and 3.)

4. The most efficient way for the database itself to do the updates would be to first insert all the data into the table, and then update each index in turn, having first sorted the inserted keys in the appropriate order for that index.

Mark
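
A rough sketch of (1), assuming old-style inheritance partitioning and made-up names (a parent table big_table with a "created" date, new rows staged in new_rows); adapt to your own schema:

  -- One child table per month, with a CHECK constraint for exclusion:
  CREATE TABLE big_table_2012_07 (
      CHECK (created >= DATE '2012-07-01' AND created < DATE '2012-08-01')
  ) INHERITS (big_table);

  -- Load only the rows belonging to that partition, sorted so the whole
  -- batch lands in one child table in one contiguous pass:
  INSERT INTO big_table_2012_07
  SELECT * FROM new_rows
  WHERE created >= DATE '2012-07-01' AND created < DATE '2012-08-01'
  ORDER BY created;

The per-partition indexes are small enough to stay mostly in cache while that partition is being loaded, which is where the win comes from.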
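
And for (2) and (3), something along these lines (again, the table, column and index names are only placeholders):

  -- (2) Drop the index, bulk load, rebuild. The rebuild is one big sort,
  -- which is much cheaper than maintaining the btree row by row:
  DROP INDEX IF EXISTS big_table_id_idx;
  INSERT INTO big_table (id, payload)
  SELECT id, payload FROM new_rows;
  CREATE INDEX big_table_id_idx ON big_table (id);

  -- (3) If the index has to stay in place, sort the batch into its key
  -- order so successive insertions touch neighbouring btree pages rather
  -- than random ones:
  INSERT INTO big_table (id, payload)
  SELECT id, payload FROM new_rows ORDER BY id;

With several indexes you can only sort for one of them, which is why having fewer indexes helps.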