Changing primary key of large table, with foreign keys, without locking

Martijn van Oosterhout <kleptog@xxxxxxxxx> · Thu, 21 Nov 2013 00:28:41 +0100

I have a database with a rather large events table defined something
like:

# create table events(a int4, b int4, primary key(b,a));
CREATE TABLE

There are more columns, but thy are not relevent here. What I do have
is 7(!) tables that reference this one like so:

# create table attr1(a int4, b int4, foreign key (a,b) references events(a,b));

Note the foreign key references (a,b), the index on (b,a) is not really
useful since lookups by b only are pointless. There is also an index on
events(a) which I'd like to remove by simply making a primary key on
(a,b).

This table has 100 million rows and downtime is somewhat undesirable.

What I thought I'd do is create a new primary key index, create new
foreign keys, fake the validation and then drop the old primary key.
But it requires some catalog manipulation which I'm pretty sure is
safe, but I hope others can verify.

We start with:

# \d events
    Table "public.events"
 Column |  Type   | Modifiers 
--------+---------+-----------
 a      | integer | not null
 b      | integer | not null
Indexes:
    "events_pkey" PRIMARY KEY, btree (b, a)
Referenced by:
    TABLE "attr1" CONSTRAINT "attr1_a_fkey" FOREIGN KEY (a, b) REFERENCES events(a, b)

So I've come up with the following schema:

Step 1: Create a new primary key, without locking

# create unique index concurrently events_a_b_pkey on events(a,b);
# update pg_index set indisprimary=false where indrelid='events'::regclass and indisprimary=true;
# alter table events add primary key using index events_a_b_pkey;

This should be safe because marking an index non-primary doesn't change
anything really.

Now we have:

# \d events
    Table "public.events"
 Column |  Type   | Modifiers 
--------+---------+-----------
 a      | integer | not null
 b      | integer | not null
Indexes:
    "events_a_b_pkey" PRIMARY KEY, btree (a, b)
    "events_pkey" UNIQUE, btree (b, a)
Referenced by:
    TABLE "attr1" CONSTRAINT "attr1_a_fkey" FOREIGN KEY (a, b) REFERENCES events(a, b)

Step 2: create new foreign keys, wthout locking

# alter table attr1 add foreign key (a,b) references events not valid;
# update pg_constraint set convalidated=true where conname='attr1_a_fkey1';

This is safe because it's identical to the other foreign key, except
that the dependancies are different.  Note it is very important *not*
to specify the columns on the events table, or PostgreSQL picks the
wrong index to associate with.

Now we have:

# \d events
    Table "public.events"
 Column |  Type   | Modifiers 
--------+---------+-----------
 a      | integer | not null
 b      | integer | not null
Indexes:
    "events_a_b_pkey" PRIMARY KEY, btree (a, b)
    "events_pkey" UNIQUE, btree (b, a)
Referenced by:
    TABLE "attr1" CONSTRAINT "attr1_a_fkey" FOREIGN KEY (a, b) REFERENCES events(a, b)
    TABLE "attr1" CONSTRAINT "attr1_a_fkey1" FOREIGN KEY (a, b) REFERENCES events(a, b)

Step 3: Remove original primary key

# alter table events drop constraint events_pkey cascade;
NOTICE:  drop cascades to constraint attr1_a_fkey on table attr1
ALTER TABLE

And we're done!

# \d events
    Table "public.events"
 Column |  Type   | Modifiers 
--------+---------+-----------
 a      | integer | not null
 b      | integer | not null
Indexes:
    "events_a_b_pkey" PRIMARY KEY, btree (a, b)
Referenced by:
    TABLE "attr1" CONSTRAINT "attr1_a_fkey1" FOREIGN KEY (a, b) REFERENCES events(a, b)

Voila!

Am I missing anything? It's not pretty, but it reduces the problem to a
few short exclusive locks, rather than hours of downtime scanning
tables.

PG 9.1.10 FWIW.
-- 
Martijn van Oosterhout   <kleptog@xxxxxxxxx>   http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.
   -- Arthur Schopenhauer
Attachment:
signature.asc

Description: Digital signature