On 16/10/2010 9:58 AM, Navkirat Singh wrote:
> Hi Guys,
>
> I am interested in finding out the pros/cons of using UUID as a
primary key field. My requirement states that UUID would be perfect in my case as I will be having many small databases which will link up to a global database using the UUID. Hence, the need for a unique key across all databases. It would be extremely helpful if someone could help me figure this out, as it is critical for my project.
Pro: No need for (serverid,serverseq) pair primary keys or hacks with modulus based key generation. Doesn't set any pre-determined limit on how many servers/databases may be in a cluster.
Con: Slower than modulo key generation approach, uses more storage. Foreign key relationships may be slower too.
Overall, UUIDs seem to be a favoured approach. The other way people seem to do this is by assigning a unique instance id to each server/database out of a maximum "n" instances decided at setup time. Every key generation sequence increments by "n" whenever it generates a key, with an offset of the
server/database id. That way, if n=100, server 1 will generate primary keys 001, 101, 201, 301, ..., server 2 will generate primary keys 002, 102, 202, 302, ... and so on.
That works great until you need more than 100 instances, at which point you're really, REALLY boned. In really busy systems it also limits the total amount of primary key space - but with BIGINT primary keys, that's unlikely to be something you need to worry about.
The composite primary key (serverid,sequenceid) approach avoids the need for a pre-defined maximum number of servers, but can be slow to index and can require more storage, especially because of tuple headers.
I have no firsthand experience with any of these approaches so I can't offer you a considered opinion. I know that the MS-SQL crowd at least strongly prefer UUIDs, but they have very strong in-database UUID support. MySQL folks seem to mostly favour the modulo primary key generation approach. I
don't see much discussion of the issue here - I get the impression Pg doesn't see heavy use in sharded environments.
-- Craig Ringer
Tech-related writing at
http://soapyfrogs.blogspot.com/-- Sent via pgsql-performance mailing list (
pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance