Search Postgresql Archives

Re: Suggest note in index documentation about long running transactions

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On Tue, Feb 16, 2016 at 6:04 PM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
I wrote:
> Chris Travers <chris.travers@xxxxxxxxx> writes:
>> Unless there is a sense that this is a bad idea I will submit a doc patch.

> I was already working on it ... I think what we want is something along
> this line in the "Building Indexes Concurrently" section of the CREATE
> INDEX ref page:

After further perusal of the code I propose replacing that para with this
wording:

   <para>
    In a concurrent index build, the index is actually entered into
    the system catalogs in one transaction, then two table scans occur in
    two more transactions.  Before each table scan, the index build must
    wait for existing transactions that have modified the table to terminate.
    After the second scan, the index build must wait for any transactions
    that have a snapshot (see <xref linkend="mvcc">) predating the second
    scan to terminate.  Then finally the index can be marked ready for use,
    and the <command>CREATE INDEX</> command terminates.
    Even then, however, the index may not be immediately usable for queries:
    in the worst case, it cannot be used as long as transactions exist that
    predate the start of the index build.
   </para>

This is a good deal clearer, IMO, about the conditions under which
transactions block CREATE INDEX CONCURRENTLY; we need that because
otherwise it's not going to make much sense to talk about old
transactions still existing after the command completes.

Agreed.

The reason for the "worst case" weasel-wording is that the problem you
saw doesn't actually occur unless the index build detected some broken
HOT chains.  I do not want to get into explaining what those are here,
so it seemed best to just be vague about whether there's a delay in
index usability or not.

I think that is also a good optimization, documentation wise.

(BTW, I wondered whether this wasn't just a bug and we should make things
less confusing by having CREATE INDEX CONCURRENTLY not complete until
the index is fully usable.  However, it appears the reason we don't do
that is it would create a risk of two CREATE INDEX CONCURRENTLY commands
deadlocking, ie they'd each think they have to wait for the other one.)

I think even without the deadlocking that would be far worse than the current behavior.   The current behavior is a bit opaque when it happens (and in this case I could certainly see HOT chains being a problem case on this db as it is large, but also that this specific table has tremendous turnover, and a few transactions which read from the table can be extremely long running -- have shortened the longest running case from about 4 days to about 18 hours -- don't ask).

If you do anything, raising a NOTICE that the index is deferred for usability might be a good thing, but the problems with delaying exit go well beyond deadlocks.

                        regards, tom lane



--
Best Wishes,
Chris Travers

Efficito:  Hosted Accounting and ERP.  Robust and Flexible.  No vendor lock-in.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux