Re: SELECT INTO large FKyed table is slow

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 11/30/2010 05:26 PM, Mladen Gogala wrote:
At the beginning of the load, you should defer all of the deferrable
constraints, setting constraints deferred and issuing the copy statement
within a transaction block, like this:

scott=# begin; BEGIN
Time: 0.203 ms
scott=# set constraints all deferred;
SET CONSTRAINTS
Time: 0.201 ms
scott=# copy test1 from '/tmp/test1.csv';
COPY 100
Time: 11.939 ms
scott=# commit;
ERROR: insert or update on table "test1" violates foreign key
constraint "fk_tst1_deptno"
DETAIL: Key (col1)=(1) is not present in table "dept".


Of course, that will require complete rewrite of your load script,
because the errors will be checked at the commit time and transaction
can either fail as a whole or succeed as a whole. It's all or nothing

Well, it is like that now. First I load the data from the CSV into the temporary table (just named temporary, exists on the server). That table is usualy aroun 10k rows. Then I call the function which does the job.

situation. How frequently do you see records with an incorrect drone_id?

Seldom.

If that happens only once in a blue moon, you may need no stinkin'
foreign keys in the first place, you may be able
to have a batch job that will flag all the records with an invalid
drone_id instead.

I did have that idea, yes, but still, I'd like to know what is slowing postgres down. Because when I look at the disk I/O, it seems very random - i get around 800k of disk reads and ocasionaly 1500k of writes (during insert into history table).

Furthermore, you can make sure that you have enough shared buffers to
cache the entire "drones" table. Also, do "strace" on the postgres
process handling your session and see whether the time is spent writing
to WAL archives. If that is slowing you down, you should consider buying
a SSD or a high end disk drive. I have never had such problem, but you
should also check whether pg_loader can do anything for you.

As far as speed is concerned, inserting with deferred foreign keys is
almost as fast as inserting without foreign keys:

scott=# alter table test1 drop constraint fk_tst1_deptno;
ALTER TABLE
Time: 16.219 ms
scott=# copy test1 from '/tmp/test1.csv';
COPY 100
Time: 10.418 ms

If you take a look at the example above, you will see that inserting
with a deferred FK took 11.939 milliseconds while inserting into the
same table without the FK took 10.418 milliseconds, the difference of
1.5 milliseconds per 100 rows. The timing of 2 seconds per 100
rows looks suspiciously high. Me thinks that your problem is not just
the foreign key, there must be something else devouring the time. You
should have a test instance, compiled with "-g" option and do profiling.

I'll have to. So far I've been doing this only on that dedicated server. I'll try to download the database to my desktop and try the tests there.

Concerning the shared_buffers, it's 256M, and the drones table is just 15M.

I have tried your recommendation and it yielded no difference.

Now I tried removing the constraints from the history table (including the PK) and the inserts were fast. After few 'rounds' of inserts I added constraints back, and several round after that were fast again. But then all the same. Insert of some 11k rows took 4 seconds (with all constraints) and now the last one of only 4k rows took one minute. I did vacuum after each insert.


	Mario

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux