Re: Why does CREATE INDEX CONCURRENTLY need two scans?

Tom Lane <tgl@xxxxxxxxxxxxx> · Tue, 31 Mar 2015 23:54:38 -0400

Michael Paquier <michael.paquier@xxxxxxxxx> writes:
> On Wed, Apr 1, 2015 at 9:43 AM, Joshua Ma <josh@xxxxxxxxxxxxx> wrote:
>> Why are two scans necessary? What would break if it did something like the
>> following?
>> 
>> 1) insert pg_index entry, wait for relevant txns to finish, mark index
>> open for inserts
>> 
>> 2) build index in a single snapshot, mark index valid for searches

>> Wouldn't new inserts update the index correctly? Between the snapshot and
>> index-updating txns afterwards, wouldn't all updates be covered?

> When an index is built with index_build, are included in the index only the
> tuples seen at the start of the first scan. A second scan is needed to add
> in the index entries for the tuples that have been inserted into the table
> during the build phase.

More to the point: Joshua's design supposes that retail insertions into
an index can happen in parallel with index build.  Or in other words,
that index build consists of instantaneously creating an empty-but-valid
index file and then doing a lot of ordinary inserts into it.  That's a
possible design, but it's not very efficient, and most of our index AMs
don't do it that way.  btree, for instance, starts by sorting all the
entries and creating the leaf-level pages.  Then it builds the upper tree
levels.  It doesn't have a complete tree that could support retail
insertions until the very end.  Moreover, most of the work is done in
storage that's local to the backend running CREATE INDEX, and isn't
accessible to other processes at all.

			regards, tom lane

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general