Re: UUID column as primary key?

Chris Browne <cbbrowne@xxxxxxx> · Wed, 05 Jan 2011 18:22:08 -0500

ajs@xxxxxxxxxxxxxxx (Andrew Sullivan) writes:
> On Wed, Jan 05, 2011 at 12:41:43PM -0700, Scott Ribe wrote:
>> I'm not sidestepping the point at all.
>
> You may be missing it, however, because. . .
>
>> The point is that the finiteness of the space is a red herring. The
>> space is large enough that there's no chance of collision in any
>> realistic scenario.
>> In order to get to a point where the probability
>> of collision is high enough to worry about, you have to generate
>> (and collect) UUIDs at a rate that is simply not realistic--as in
>> your second example quoted above.
>
> . . .the example was not that UUIDs are being generated and collected
> in one place at that rate, but that they're being generated in several
> independent places at a time, and if the cost of the collision is
> extremely high, there might be reasons not to use the UUID strategy
> but instead to use something else that is generated algorithmically by
> the database.  There's a trade-off in having distributed systems
> acting completely independently, and while I have lots of confidence
> in my colleagues at the IETF (and agree with you that for the
> overwhelming majority of cases UUIDs are guaranteed-unique enough),
> correctly making these trade-offs still requires thought and
> analysis.  It's exactly the kind of analysis that professional
> paranoids like DBAs are for.

But it seems to me that some of the analysis is getting a little *too*
paranoid, on the "perhaps UUIDs are the wrong answer" side of the
ledger.

There are no panaceas here; if the process that is using IDs is fragile,
then things can break down whether one is using UUID or SERIAL.

I prefer the "probably unique enough" side of the fence, myself.
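For a rough sense of scale, here is a back-of-the-envelope sketch using
the usual birthday approximation.  It assumes version 4 (random) UUIDs
with 122 random bits and a perfect random source; the function name and
the row count are purely illustrative.

```python
import math

# Assumption: version 4 UUIDs, which carry 122 random bits.
UUID4_SPACE = 2 ** 122


def collision_probability(n, space=UUID4_SPACE):
    """Approximate chance of at least one duplicate among n random IDs.

    Birthday approximation: p ~= 1 - exp(-n*(n-1) / (2*space)).
    expm1 keeps precision when the result is extremely small.
    """
    pairs = n * (n - 1) // 2          # number of distinct pairs of IDs
    return -math.expm1(-pairs / space)


# Illustrative figure: one billion UUIDs in total, wherever they were generated.
p_billion = collision_probability(10 ** 9)
print(f"{p_billion:.3e}")
```

Under those assumptions, a billion UUIDs works out to a collision
probability on the order of 1e-19.  A weak random source would change
the picture, which is part of why the surrounding process still has to
cope with a failed insert.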

And the process that uses the IDs needs to be robust enough that things
won't just fall apart in tatters if it runs into non-uniqueness.

I wouldn't expect that to be a terribly big deal - if there's a
UNIQUE index on a UUID-based column, then an insert will fail, and the
process can pick between things like:
 - Responding that it had a problem, or
 - Retrying.
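
A minimal sketch of the "retry" option, with "respond that it had a
problem" as the fallback.  To keep it self-contained it uses Python's
built-in sqlite3 as a stand-in for PostgreSQL; the table name, the
function name and the attempt limit are all hypothetical.  With
psycopg2 the exception to catch would be psycopg2.errors.UniqueViolation
rather than sqlite3.IntegrityError.

```python
import sqlite3
import uuid


def insert_with_retry(conn, payload, max_attempts=3, new_id=uuid.uuid4):
    """Insert payload under a fresh UUID; retry if the unique index rejects it."""
    for _ in range(max_attempts):
        candidate = str(new_id())
        try:
            with conn:  # commits on success, rolls back on an exception
                conn.execute(
                    "INSERT INTO items (id, payload) VALUES (?, ?)",
                    (candidate, payload),
                )
            return candidate
        except sqlite3.IntegrityError:
            # The only constraint on this table is the unique key, so this
            # means a duplicate ID.  Retry with a newly generated UUID.
            continue
    # Option 1 from the list above: report the problem to the caller.
    raise RuntimeError(f"no unique ID found after {max_attempts} attempts")


# Hypothetical table; the PRIMARY KEY gives us the unique index.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id TEXT PRIMARY KEY, payload TEXT)")
row_id = insert_with_retry(conn, "first row")
```

The new_id parameter exists only so that a collision can be forced in a
test; in normal use the default uuid.uuid4 is what generates the IDs.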

And if the system isn't prepared for that sort of condition, then it's
also not prepared for some seemingly more likely error conditions such
as:
 - The DB connection timed out because something fuzzed out on the
   network
 - The DB server fell over and is restarting because (power failed,
   someone kicked the switch, disk ran out, ...)

It seems rather silly to be *totally* paranoid about the
not-infinite-uniqueness of UUIDs when there are plenty of other risks
lurking around that also need error checking.
-- 
"cbbrowne","@","gmail.com"
http://linuxdatabases.info/info/slony.html
"How can you dream the impossible dream when you can't get any sleep?"
-- Sam Robb

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)