On 03/04/13 06:37, Merlin Moncure wrote: > On Tue, Apr 2, 2013 at 2:01 PM, Samantha Atkins <sjatkins@xxxxxx> wrote: >> Natural keys are in user data space. Thus they are not guaranteed invariant and therefore cannot serve as persistent identity. Can't find Samantha's original post. I agree but done right they can be (persistent and unique) > That is true, but irrelevant in most real world cases. Also, nothing > is keeping you from using an extra marker if/when you need to provide > an invariant lookup. > >> Also natural keys have the considerable defect of being of different type and arity per logical entity type. This means that very nice things like dynamic relationships and persistent collections of references to multiple types of things cannot be built. It also increases ORM and data cache complexity. > > OO evangelism. > >> These are considerable weaknesses. The Google cache is filled with "foo vs bar" i.e. natural vs surrogate and its amazing what you get, even surrogate key under wikipedia comes with a "its sources remain unclear because it lacks inline citations" disclaimer. I consider it pretty much a non debate. >> You cannot guess a uuid from say uuid.uuid4(). If you think you can I look forward to seeing your proof. > > I never claimed that. I said that postgresql random() can be guessed, > which it can, since it's based on lrand48. > > merlin > > Trying to get back on topic with the original post. I have the iso (?) country code table, I have no problem with the PK being the char(2) country code. These natural codes/keys are thought out, debated and eventually decided by other people. But I also don't have a problem with adding an integer (serial) column and making that my PK, although that PK might be pretty useless to the rest of the world. So thats that, having to really think it out is probably a good sign that you should stick to a surrogate unless you are really sure. (again I don't advocate ON UPDATE CASCADE as a solution should you change your mind) As to the whole natural vs surrogate/synthetic key debate, as I mentioned in an earlier post I use them both. The question is when is it worthy of a PK. 1)cust_id=123 (surrogate: PK) vs 2)cust_id=1231 (natural: checkbit such as barcode data etc) vs 3)cust_id=<natural: uuencoded binary that spits out "123" after being decoded by the app> For me, 2) is more likely to be a PK than 3), but it is entirely possible that neither would be a PK. Global/Universal unique identifiers, useful with replication and database partitioning (in my instance) 1)cust_id=<uuid> vs 2)cust_id=<shard_id>-<something_extra>-<cust_id> 1) will work, but 128bits is alot of random data that could be useful to the app. 2) cust_id is not as universally unique, but if that was ever a problem I could also wrap that in a encoded binary with a small change to the app and no change to the database now it resembles something truly random. The difference is 2) is more useful and contains "routing" data. These are all natural and exposed to the world. But the question still remains are they worthy of being a PK within the database of origin? So far the answer from me is "doesn't have to be" but everyone else might think it is, they may even make it a PK. Regards, Jules. -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general