I know this issue has beeen brought up before (since 2000 infact), but lots has changed with postgresql since then. I have followed this issue through versions 7.2, 7.3, and now 7.4. I'm talking about the additional locking involved with foreign keys. For example, assume you have an order detail table that references a status table, and you only have two statuses (open and closed). Transactions that insert, delete, or update the order detail table, will have to wait on each other to complete one at a time, assuming they use the same status (open, for example). This causes a huge performance hit, since it effectively serializes transactions involving tables with foreign keys. Here is how to replicate the problem: 1) Create two tables as follow and populate one with a couple of rows... create table test (id integer primary key); create table test_fk (id integer primary key, fk integer references test (id)); insert into test (id) values ('1'); insert into test (id) values ('2'); 2) Start a transaction and insert a row, but do not commit or rollback... begin; insert into test_fk (id,fk) values ('1','1'); 3) Start another transaction and try to insert a row... begin; insert into test_fk (id,fk) values ('2','1'); The second transaction will hang until the first transaction ends. Looking at the pg_class and pg_locks tables, you get the following output: core=# select c.relname,l.* from pg_locks l left join pg_class c on (l.relation = c.relfilenode) order by c.relname; relname | relation | database | transaction | pid | mode | granted -----------+------------+------------+-------------+-------+------------------+--------- pg_class | 1259 | 2306070060 | | 23222 | AccessShareLock | t pg_locks | 16759 | 2306070060 | | 23222 | AccessShareLock | t test | 2354271522 | 2306070060 | | 23217 | AccessShareLock | t test | 2354271522 | 2306070060 | | 23217 | RowShareLock | t test | 2354271522 | 2306070060 | | 23219 | AccessShareLock | t test | 2354271522 | 2306070060 | | 23219 | RowShareLock | t test_fk | 2354271528 | 2306070060 | | 23219 | AccessShareLock | t test_fk | 2354271528 | 2306070060 | | 23219 | RowExclusiveLock | t test_fk | 2354271528 | 2306070060 | | 23217 | AccessShareLock | t test_fk | 2354271528 | 2306070060 | | 23217 | RowExclusiveLock | t test_pkey | 2354271524 | 2306070060 | | 23219 | AccessShareLock | t | | | 1164421 | 23219 | ExclusiveLock | t | | | 1164425 | 23222 | ExclusiveLock | t | | | 1164408 | 23217 | ExclusiveLock | t | | | 1164408 | 23219 | ShareLock | f (15 rows) core=# select ctid,xmin,xmax,* from test; ctid | xmin | xmax | id -------+---------+---------+---- (0,1) | 1164393 | 1164408 | 1 (0,2) | 1164394 | 0 | 2 (2 rows) As you can see, what is blocking, is the ShareLock on the transaction. After reading through the code, I realized that this is the intended behavior for updates and deletes to the same row. In this case, it's the "select for update" query that's run by postgresql to prevent deletes on the value that the foreign key is referencing, that causes this ShareLock on the transaction. The AccessShareLock on the referenced row will prevent any other transaction from obtaining an ExclusiveLock (needed to delete or update), so there is not really a need to "serialize" these transactions in cases such as this. The code notices that xmax for that tuple is set to a valid transaction id, so it creates a ShareLock on the xmax transaction id (our first transaction) to make the second transaction wait for the first to complete. Since our first transaction is not updating or deleting that row, xmax should not have been updated (the select for update is the culprit). If "select for update" did not update xmax, but still aquired the AccessShareLock, foreign keys would work properly in postgresql (the locks would prevent bad things from happening). I don't know if this would break other functionality, but if so, then it seems it would not be much harder to come up with a way of aquiring the correct locks but not updating xmax. Any thoughts? Will Reese Database Administrator Rackspace Managed Hosting wreese@rackspace.com ---------------------------(end of broadcast)--------------------------- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match