On Tue, Feb 16, 2016 at 6:04 PM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
I wrote:
> Chris Travers <chris.travers@xxxxxxxxx> writes:
>> Unless there is a sense that this is a bad idea I will submit a doc patch.
> I was already working on it ... I think what we want is something along
> this line in the "Building Indexes Concurrently" section of the CREATE
> INDEX ref page:
After further perusal of the code I propose replacing that para with this
wording:
<para>
In a concurrent index build, the index is actually entered into
the system catalogs in one transaction, then two table scans occur in
two more transactions. Before each table scan, the index build must
wait for existing transactions that have modified the table to terminate.
After the second scan, the index build must wait for any transactions
that have a snapshot (see <xref linkend="mvcc">) predating the second
scan to terminate. Then finally the index can be marked ready for use,
and the <command>CREATE INDEX</> command terminates.
Even then, however, the index may not be immediately usable for queries:
in the worst case, it cannot be used as long as transactions exist that
predate the start of the index build.
</para>
This is a good deal clearer, IMO, about the conditions under which
transactions block CREATE INDEX CONCURRENTLY; we need that because
otherwise it's not going to make much sense to talk about old
transactions still existing after the command completes.
Agreed.
The reason for the "worst case" weasel-wording is that the problem you
saw doesn't actually occur unless the index build detected some broken
HOT chains. I do not want to get into explaining what those are here,
so it seemed best to just be vague about whether there's a delay in
index usability or not.
I think that is also a good optimization, documentation wise.
(BTW, I wondered whether this wasn't just a bug and we should make things
less confusing by having CREATE INDEX CONCURRENTLY not complete until
the index is fully usable. However, it appears the reason we don't do
that is it would create a risk of two CREATE INDEX CONCURRENTLY commands
deadlocking, ie they'd each think they have to wait for the other one.)
I think even without the deadlocking that would be far worse than the current behavior. The current behavior is a bit opaque when it happens (and in this case I could certainly see HOT chains being a problem case on this db as it is large, but also that this specific table has tremendous turnover, and a few transactions which read from the table can be extremely long running -- have shortened the longest running case from about 4 days to about 18 hours -- don't ask).
If you do anything, raising a NOTICE that the index is deferred for usability might be a good thing, but the problems with delaying exit go well beyond deadlocks.
regards, tom lane
--
Best Wishes,
Chris Travers
Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor lock-in.