Search Postgresql Archives

Re: Avoiding surrogate keys

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 21 April 2010 20:18, Merlin Moncure <mmoncure@xxxxxxxxx> wrote:
On Wed, Apr 21, 2010 at 3:01 PM, Thom Brown <thombrown@xxxxxxxxx> wrote:
> I think I know what I plan to do, but want to throw this out there to see if
> there are differing points of view.
> I have a mailing list table, and 2 of the columns contain values which have
> to be from a list.  These are country and status.  There are 237 possible
> countries and 3 possible statuses.  Now I know some people would assign a
> sequence ID (surrogate key) to the country and status values, and have them
> looked up in separate tables to get the textual value, but I think I'll
> still have those tables, just without an ID column, so 1 column for both the
> countries and statuses tables.  This means storing the proper value in the
> main table.
> So instead of
> name, email, country, status
> 'mr smith', 'emailaddress@xxxxxxxxxxx', 44, 2
> 'mrs jones', 'me@xxxxxxxxxxxxxxxx', 21, 1
> I'd have
> name, email, country, status
> 'mr smith', 'emailaddress@xxxxxxxxxxx', 'China', 'Registered'
> 'mrs jones', 'me@xxxxxxxxxxxxxxxx', 'Belgium', 'Unconfirmed'
> The values of course would be constrained by foreign key lookup to their
> associated tables.
> Are there any serious downsides to this?  If so, what would you recommend?

Natural keys:
*) force formal relationships into your key design (this is good)
*) Make your database MUCH easier to follow, browse, and understand
*) in particular cases allow you to skip joins
*) will make your indexes fatter (this is not good)
*) can be a pain if your keys are updated frequently
*) can be a major pain if your key changes in structure (adds a field,
or changes in type)

Surrogate keys:
*) Give you faster joins, but more of them (this is a win/loss
depending on circumstances)
*) Tend to encourage lazy/poor designs, since you hide relationships
behind a value
*) Make the tables more difficult to browse and understand
*) Make updates to keys/key structure trivial

I personally use natural keys when I can and surrogates when I have
to.  When I do use a surrogate, I tend to still define the natural key
as primary and simply make a alternate 'unique' constraint for the
surrogate.

merlin

Thanks for the comments guys.  I'm now pretty sure using natural keys is the right approach.  Neither list will ever change type, and if they change, it'll be quite infrequent.  Index size shouldn't really be a problem since we're realistically talking thousands of rows.

What I hate about surrogate keys is having to keep looking stuff up:

INSERT INTO stuff (col_a, col_b)
SELECT 'my_value', table_b.id
FROM table_b
WHERE table_b.real_value = 'PostgreSQL';

As opposed to:

INSERT INTO stuff (col_a, col_b) VALUES ('my_value', 'PostgreSQL');

Just making sure I don't fall into the surrogate key abuse trap. :)

Thom

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux