After more testing I have gained some insights: the foreign key constraints are NOT responsible for the low COPY FROM performance in my case. I had forgotten about the indexes which are created along with the FK constraints.

Besides the primary key

    CONSTRAINT obj_item_loc_pkey PRIMARY KEY (obj_item_id, loc_id, obj_item_loc_ix)

the table OBJ_ITEM_LOC has four additional indexes (let's call them idx_1 through idx_4):

    CREATE INDEX idx_1 ON obj_item_loc USING btree (rec_id);
    CREATE INDEX idx_2 ON obj_item_loc USING btree (loc_id);
    CREATE INDEX idx_3 ON obj_item_loc USING btree (rptd_id);
    CREATE INDEX idx_4 ON obj_item_loc USING btree (obj_item_id);

Indexes 2 through 4 are intended to speed up joins between OBJ_ITEM_LOC and LOC (loc_id), RPTD (rptd_id) and OBJ_ITEM (obj_item_id) respectively (and I'm highly suspicious whether this makes sense at all). idx_4, together with a simple SELECT in the table's on-insert trigger, slows things down considerably. With idx_4 and the trigger in place the rates are:

    44100 rows, 0:00:04.576,  9637 r/s: LOC
     2101 rows, 0:00:00.221,  9506 r/s: OBJ_ITEM
     2101 rows, 0:00:00.278,  7557 r/s: ORG
    94713 rows, 0:00:18.502,  5119 r/s: RPTD
    44100 rows, 0:03:03.437,   240 r/s: OBJ_ITEM_LOC

    imported 187115 records in 0:03:27.081 => 903 r/s

pg_statio comes up with the same big numbers (reads = bad, hits = not so bad?):

        relname   | heap_blks_read | heap_blks_hit | idx_blks_read | idx_blks_hit
     -------------+----------------+---------------+---------------+--------------
     obj_item_loc |           1262 |       9908013 |          1199 |      1682005
     rptd         |           4434 |        279022 |          1806 |      1270746
     org          |             38 |        191559 |            19 |       201071
     obj_item     |             84 |         92476 |            29 |       104134
     loc          |            768 |         88902 |           597 |       352680
     (5 rows)

Dropping idx_1, idx_2 and idx_3 at the same time has no significant impact.
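One thing I may try next (just a sketch, not yet tested against this schema): drop idx_4 for the duration of the bulk load and rebuild it afterwards. A single index build over the finished table should be much cheaper than maintaining the index row by row during COPY.

```sql
-- Sketch: rebuild idx_4 once after the load instead of updating it per row.
-- Assumes the data arrives via COPY ... FROM STDIN as in my load script.
BEGIN;
DROP INDEX idx_4;
COPY obj_item_loc FROM STDIN;
COMMIT;

CREATE INDEX idx_4 ON obj_item_loc USING btree (obj_item_id);
ANALYZE obj_item_loc;  -- refresh planner statistics after the load
```

Of course this only helps if nothing else needs idx_4 while the load is running.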
But take away idx_4 only:

    44100 rows, 0:00:04.558,  9675 r/s: LOC
     2101 rows, 0:00:00.220,  9593 r/s: OBJ_ITEM
     2101 rows, 0:00:00.275,  7640 r/s: ORG
    94713 rows, 0:00:18.407,  5145 r/s: RPTD
    44100 rows, 0:00:11.433,  3857 r/s: OBJ_ITEM_LOC

    imported 187115 records in 0:00:34.938 => 5355 r/s

Hm, not bad. Now for the SELECT statement in the on-insert trigger:

    SELECT * FROM obj_item_loc
    WHERE obj_item_loc.obj_item_id = NEW.obj_item_id
      AND obj_item_loc.loc_id = NEW.loc_id
      AND obj_item_loc.obj_item_loc_ix = NEW.obj_item_loc_ix
    INTO old;

Executing this query AFTER the bulk insert (and probably some auto-vacuuming), the query plan looks like this:

    explain analyze
    select * from obj_item_loc
    where (obj_item_id, loc_id, obj_item_loc_ix) =
          (10903011224100014650, 10903010224100089226, 10900024100000140894)

                                  QUERY PLAN
    ----------------------------------------------------------------------
     Index Scan using obj_item_loc_loc_id_idx on obj_item_loc
         (cost=0.00..8.36 rows=1 width=329)
         (actual time=0.039..0.040 rows=1 loops=1)
       Index Cond: (loc_id = 10903010224100089226::numeric)
       Filter: ((obj_item_id = 10903011224100014650::numeric) AND
                (obj_item_loc_ix = 10900024100000140894::numeric))
     Total runtime: 0.079 ms

After some head-scratching I realized that obj_item_id references a meager 2101 rows, which probably does not make for a good index candidate. So the query plan makes some sense, I guess.

Now I have some (more) questions:

1. How do I know which index (if any) is chosen for a SELECT statement inside a trigger during a bulk load transaction? (Or, for that matter, inside a series of recursive PL/pgSQL functions?)

2. The query planner depends on stats collected by autovacuum/VACUUM ANALYZE, right? Does stats collecting also happen during a lengthy transaction?

3. Is it possible (or even advisable) to trigger VACUUM ANALYZE inside an ongoing transaction? Let's say: load 10,000 rows of table A, analyze table A, insert the next 10,000 rows, analyze again, ...

I'm sorry if this is basic stuff I'm asking here, but especially point 2 is bothering me.
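Regarding question 1, a sketch of what I might try: the auto_explain contrib module can log the plans of statements executed inside triggers and PL/pgSQL functions when its log_nested_statements setting is enabled (the settings below are real auto_explain parameters; the zero-duration threshold is just for this experiment, and LOAD may require superuser rights).

```sql
-- Load auto_explain for the current session and log plans of all
-- statements, including those nested inside triggers and functions.
LOAD 'auto_explain';
SET auto_explain.log_min_duration = 0;        -- log every statement's plan
SET auto_explain.log_nested_statements = on;  -- include trigger/function SQL
SET auto_explain.log_analyze = on;            -- include actual rows/times

-- Now run the bulk load; the plans chosen for the trigger's SELECT
-- (and thus which index it uses) appear in the server log.
COPY obj_item_loc FROM STDIN;
```

As for question 3: ANALYZE (unlike VACUUM) can be run inside a transaction block, so interleaving batches of inserts with ANALYZE statements is at least syntactically possible.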
--
Kind regards
Horst Dehmer

On 12.01.2013, at 01:17, Jeff Janes <jeff.janes@xxxxxxxxx> wrote:

> On Friday, January 11, 2013, Horst Dehmer wrote: