Re: IS it a good practice to use SERIAL as Primary Key?

"Brandon Aiken" <BAiken@xxxxxxxxxxxxxxx> · Mon, 27 Nov 2006 14:21:07 -0500

Simply put, it doesn't scale as well.

If a table already has candidate keys, then you've presumably got unique
indices on them.  A surrogate primary key adds another segment of data
to pass through I/O and another index to maintain.  Under high loads,
those extra cycles will cost you transactions per minute.

If you're able to throw hardware at the problem to compensate for
performance and data size issues, it's not a problem.  Most databases
are run on systems that are overkill already.  If, OTOH, you're running
a system that needs to be able to process billions of transactions with
exabytes data (say, for example, a comprehensive multi-national health
record database) then you're going to be as interested in SQL tuning as
it's possible to be because no amount of hardware will be enough.

The other argument is that it's redundant data with no real meaning to
the domain, meaning using surrogate keys technically violates low-order
normal forms.

As far as data changing, if you're using foreign key constraints
properly you should never need to issue more than one UPDATE command.
ON UPDATE CASCADE is your friend.

It is always possible to design a domain model which perfectly captures
business logic.  However, it is *not* always possible to actually
implement that domain in a computerized RDBMS, nor is it always
practical.  Just as the domain model represents an estimated
implementation of the real world information, an RDBMS is just an
estimated implementation of the relational model.

--
Brandon Aiken
CS/IT Systems Engineer

-----Original Message-----
From: pgsql-general-owner@xxxxxxxxxxxxxx
[mailto:pgsql-general-owner@xxxxxxxxxxxxxx] On Behalf Of John McCawley
Sent: Monday, November 27, 2006 1:53 PM
To: Ron Johnson
Cc: pgsql-general@xxxxxxxxxxxxxx
Subject: Re: [GENERAL] IS it a good practice to use SERIAL as Primary
Key?

>Yes, that's the point.  They are legacy designs, and that portion of
>the design is wrong.
>
>  
>
I'll weigh in my my .02 on this subject.  After much pain and agony in 
the real world, I have taken the stance that every table in my database 
must have an arbitrary, numeric primary key (generally autogenerated).  
I have found that this gets me into a lot of arguments with other 
database guys, but never seems to cause any problems for me.  
Conversely, I have seen innumerable problems in the real world caused by

the usage of actual data as primary keys.

Perhaps I am amazingly ignorant, but I have yet to find a case where my 
approach causes any real problems.  What does using "real" data as a 
primary key buy you?  The only real advantages I can see are that an 
individual record's data will be somewhat more human-readable without 
joining to other tables, and that your search queries can be simpler 
because they don't have to join against other tables.

On the (many) occasions that I have worked on databases with "real" data

as primary keys, I just saw so many problems arise.  In the real world, 
data changes, even supposedly unchangeable data.  When using arbitrary 
primary keys, all you have to do is change the data in the one table 
where it lives.  If you are using real data as your keys, you have write

complex queries or code to "fix" your data when the supposedly 
unchangeable data changes.

Anyway, I'm sure this is a huge argument, but that's my 0.2

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend