I wrote: > I tried this locally, and what I see happening is that a checkpoint > process starts shortly after the CREATE INDEX begins whomping out the > index data --- but it doesn't finish until after the CREATE INDEX does. > AFAICS there is not any sort of locking problem, I have to take that back: there definitely is a locking problem. Perhaps there is an I/O bandwidth issue too. What I see happening on closer analysis is that btree CREATE INDEX can hold "exclusive context lock" on some shared buffers for significant periods of time. It tries to write all the levels of the btree in parallel, so it is spitting out level-zero pages at a great rate, level-one pages at a lesser rate, etc. For a large index there could be many btree levels, and pages in the higher levels will be held locked in the shared buffer arena for considerable periods. CHECKPOINT scans the shared buffer arena and tries to write every dirty page it finds. This requires getting shared context lock, and so will be blocked by the lock CREATE INDEX is holding. I am toying with the idea that CREATE INDEX shouldn't use the shared buffer manager at all; there is no need for other backends to touch the index until the creating transaction commits. We'd need to be able to fsync the index file before committing. That isn't part of the smgr API right now, but could easily be added. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 8: explain analyze is your friend