On Sat, Sep 12, 2009 at 2:11 PM, Bill Moran <wmoran@xxxxxxxxxxxxxxxxx> wrote: > On Sun, 13 Sep 2009 02:35:02 +1000 > Grant Maxwell <grant.maxwell@xxxxxxxxxxxx> wrote: > >> Hi Folks >> >> I'm looking for a bit of advice regarding alpha primary keys. >> >> I have a table (designed by someone else) that has a numeric primary >> key and also a unique non-null email address field. >> >> The use of the primary key is causing me some headaches in that in >> multiple database server environment each server allocates a unique >> number from a range >> (and that works fine) but when the table is replicated (master-master- >> master) the exception handling is a bit tricky because each database >> server may have >> records that are duplicate at the email address field - with a >> different primary key number. >> >> I don't know why it was done this way but it seems to me that the >> email addresses are unique, non null and could be used as the primary >> key. This would make the >> replication much faster and simpler. >> >> Does anyone out there think the change (number to email address as >> primary key) would be a bad idea ? >> >> your thoughts would be appreciated. > > I think you have two issues here, and if you identify them as such, you'll > come up with a better decision. > > The first is, "Should the email address be my primary key". And the answer > to that really depends on whether you expect it to change a lot, whether you > have a lot of FK relations and how easy/difficult it will be to manage those. > > The second, which may influence the first, is "how do I manage conflicts > when data from different sources is merged?" which is a huge, complex > question. One potential answer is to replace your integer surrogate key > with a GUID key, which will be unique from all the different sources. You guid solves the surrogate issue wrt data transfer, but glosses over what happens when you have duplicates. > could also use a "prefix" system to ensure uniqueness. Another option is > to use the email address itself. I'm surprised by your approach, as > personally, I've seen a LOT of people who share a single email address > (husband/wife, for example). It's not horribly common, since getting that's going to depend on how it's defined in the app. couple of different approaches: email _must_ be unique, and we care not to whom it belongs: email (email primary key) contact (email references email [...]) we do care option 1: email (email, memo text, primary key(email, memo)) contact (email, memo, references email(email, memo)) storing info in memo field to distinguish the different users we do care option 2: email (email primary key) contact (contact_id) email_contact_map ( contact_id references contact, email references email, memo, -- 'person a', 'person b', etc primary key(contact_id, email), ) The point is this: if you (the user) needs to distinguish between email adress users, that information should be in the database...allowing multiple entry of email addresses via serial or guid allows you to sneak by this requirement merlin -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general