Re: Avoiding surrogate keys

Merlin Moncure <mmoncure@xxxxxxxxx> · Tue, 4 May 2010 09:40:40 -0400

On Sat, May 1, 2010 at 4:14 PM, John R Pierce <pierce@xxxxxxxxxxxx> wrote:
>
> If your 'natural key' is a large text field, I'd have to assume there's some
> point at which a surrogate index would be more efficient.  Would this be
> above a few dozen characters, or a few 100 characters?   I wouldn't want a
> PK based on a multi-K byte text field for a table that has many 10s or 100s
> of 1000s of rows, for sure.

Well, yes, but:
*) most natural keys are small, between 4-10 bytes
*) regardless of the data type of the key, the number of btree lookups
is going to remain approximately the same
*) you have to (or at least should) put the index on anyway as unique
constraint. you do pay the price in dependent tables however.  so the
natural _primary_ key is free.  it's the foreign keys where you pay.

the two main performance issues with natural keys are this (you kinda
touched on one):
*) the index is fatter, pressuring cache
It's not so much the comparison function but the fact that the larger
index(es) require more memory. If drives were faster than they were
this wouldn't matter as much -- I expect this to become less of a
factor as SSD technology improves.  This can somewhat modulated by
clustering the index...you get  a better chance of reading multiple
relevant records on a single page.
*) cascading updates
If your key is in a lot of places and has to be updated it can cause a
mess. Lots of locks, dead space, vacuuming, etc. Most of the time
primary keys don't change very much but if they do you had better give
it fair consideration.

Natural keys have a lot of performance advantages as mentioned upthread.

merlin

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general