Re: very very slow inserts into very large table

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 16/07/12 14:37, Jon Nelson wrote:
I have a single *table* that is some 560GB in size, 6 columns, average
row width 63.
There are approximately 6.1 billion rows.
It has 4 indices, 131, 313, 131 and 190 GB in size, respectively. All
are btree indices.

I tried inserting new data into the table, and it's taking a *very* long time.
I pre-built the data to be inserted into a temporary table with the
exact same structure and column ordering, etc, and the temporary table
is about 8.5GB in size with about 93 million rows.
The temporary table was built in about 95 seconds.
The insert has been going for 47 hours and 21 minutes, give or take.
I'm not doing any correlation or filtering, etc --  straight up
insert, literally "insert into big_table select * from
the_temp_table;".

vmstat output doesn't seem that useful, with disk wait being 10-15%
and I/O speeds highly variable, from 5-20MB/s reads couple with
0-16MB/s writes, generally on the lower end of these.
strace of the inserting process shows that it's basically hammering
the disk in terms of random reads and infrequent writes.
postgresql. It's not verifying, rebuilding, etc. While this process is
active, streaming write I/O is terrible - 36MB/s. WIth it "paused"
(via strace) I get 72MB/s.  (reads are 350MB/s).

The OS is Scientific Linux 6.2, and the version of postgresql is 9.1.4
- x86_64. There is nothing else of note happening on the box. The box
is a quad CPU, dual-core each Xeon E5430  @ 2.66GHz with 32GB of RAM
and a 3ware 9690 RAID 4TB RAID10 for the storage for

What might be going on here?


Every insert updates four indexes, so at least 3 of those will be in random order. The indexes don't fit in memory, so all those updates will involve reading most of the relevant b-tree pages from disk (or at least the leaf level). A total of 10ms of random read from disk (per inserted row) wouldn't surprise me ... which adds up to more than 10 days for your 93 million rows.

Mark Thornton

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux