Re: PostgreSQL Developer Best Practices

Adrian Klaver <adrian.klaver@xxxxxxxxxxx> · Tue, 25 Aug 2015 13:56:35 -0700

On 08/25/2015 08:40 AM, Melvin Davidson wrote:
Adrian,

Stop being so technical. When we/I speak of natural keys, we are talking
about the column
that would NATURALly lend itself as the primary key.

Pretty sure this is a technical list:)

No one ever said a number is not natural. just that there is no need to
duplicate uniqueness
with a separate number.

I would agree, but I have interacted with people, especially PHBes, 
where a duplicate 'hidden' key is a life saver. See more below.

IOW: If we have an account table, then the account_id or account_no
      would be the primary key. There is no need to have a separate
serial id as the primary key.

      Likewise, if we have a car table, then registration (or
vehicle_id) is preferred.

      EG: Good
      CREATE TABLE car
      (
       registration_no varchar(30) not null,
       car_make       varchar(25) not null,
       model          varchar(15) not null;
       build_year     date not null;
       owner          varchar(50),
       CONSTRAINT car_pk PRIMARY KEY (registration_no)
       );

      bad
      CREATE TABLE car
      (
       id              serial not null,
       registration_no varchar(30) not null,
       car_make       varchar(25) not null,
       model          varchar(15) not null;
       build_year     date not null;
       owner          varchar(50),
       CONSTRAINT car_pk PRIMARY KEY (id)
       );

The benefit in avoiding arbitrary and simple values for the key is that
it makes
the database design much more logical.

Consider:
SELECT c.registration_no,
        c.car_make,
        p.part_no
    FROM car c
    JOIN parts p ON ( p.registration_no = c.registration_no)
  WHERE registration_no = <some_var>;

Pretty sure parts are not unique to an exact vehicle, unless you are 
talking a totally handmade one. They are not even unique to make and 
model. As an example, I used to work on Class B Isuzu trucks. These 
models(FTR) where also built for Chevrolet as the Forward models. So 
right of the bat there where two part numbers for each part, one that 
started with 9 if you got it from Chevrolet and one with 11 from Isuzu, 
if memory serves. Then Isuzu decided to reorganize their part numbers, 
so that introduced another number, all pointing to the exact same part. 
Then there where those parts available from the parts houses(NAPA, etc).

Then there was the greenhouse I worked for where we supplied UPC coded 
tags for our customers. In the beginning, it was simple, the item 
portion of the UPC was unique and with the company prefix served as a 
'natural' key for the tags. Then the chain stores we worked with must 
have all gone to the same seminar on how to be Walmart and decided they 
did not want unique numbers, but UPCs tied to price groups that covered 
a variety of plants. Luckily, I was too stupid to know surrogate keys 
where bad and had a sequence attached to the tag table. This then became 
the tag id and made life a lot easier during the transition. It still 
remains there, because people are people and 'natural' tends to be 
artificial and transient.

  versus:
  SELECT c.registration_no,
        c.car_make,
        p.part_no
    FROM car c
    JOIN parts p ON ( p.id <http://p.id> = c.id <http://c.id>)
  WHERE registration_no = <some_var>;

  Why join on id when registration_no is better?

On Tue, Aug 25, 2015 at 10:17 AM, Adrian Klaver
<adrian.klaver@xxxxxxxxxxx <mailto:adrian.klaver@xxxxxxxxxxx>> wrote:

    On 08/24/2015 08:44 PM, Rob Sargent wrote:

            On Aug 24, 2015, at 6:53 PM, Melvin Davidson
            <melvin6925@xxxxxxxxx <mailto:melvin6925@xxxxxxxxx>> wrote:

            You are right, he was probably talking about FK's. I was
            just so frustrated about people insisting that using "ID" as
            the primary key in every table is a "good" idea,
            I didn't bother to reply previously. I stand firm on my
            belief that the primary key should be something meaningful
            and NOT "id" just for the sake of having a unique numeric key.

        What, pray tell, is the unique natural key of person in any
        meaningfully large domain such as state? Certainly not name +
        birthdate.  Current address isn’t guaranteed. Social isn’t
        reliable and actually not truly unique.

    To add:

    1) Who determined that a number is not natural?

    2) One of the older unique natural keys (genus, species) is not so
    unique. I am a fisheries biologist by training and in my time the
    'unique' identifier for various fishes has changed. Now that
    ichthyologists have discovered DNA testing, it can be expected there
    will be even more changes. This is even more apparent when you go
    back in in history. As an example:

    https://en.wikipedia.org/wiki/Rainbow_trout

    Rainbow trout

    Current

    Oncorhynchus mykiss

    Past

    Salmo mykiss Walbaum, 1792
    Parasalmo mykiss (Walbaum, 1792)
    Salmo purpuratus Pallas, 1814
    Salmo penshinensis Pallas, 1814
    Parasalmo penshinensis (Pallas, 1814)
    Salmo gairdnerii Richardson, 1836  <--The one I learned.
    Fario gairdneri (Richardson, 1836)
    Oncorhynchus gairdnerii (Richardson, 1836)
    Salmo gairdnerii gairdnerii Richardson, 1836
    Salmo rivularis Ayres, 1855
    Salmo iridea Gibbons, 1855
    Salmo gairdnerii irideus Gibbons, 1855
    Salmo irideus Gibbons, 1855
    Trutta iridea (Gibbons, 1855)
    Salmo truncatus Suckley, 1859
    Salmo masoni Suckley, 1860
    Oncorhynchus kamloops Jordan, 1892
    Salmo kamloops (Jordan, 1892)
    Salmo rivularis kamloops (Jordan, 1892)
    Salmo gairdneri shasta Jordan, 1894
    Salmo gilberti Jordan, 1894
    Salmo nelsoni Evermann, 1908

    All the above point to the same fish and have appeared and appear in
    articles and reports about said fish. Lets not even get into the
    common name situation:).

        Even given that there are models which are made of entities with
        legitimate attributes which per force define a unique instance,
        I see no benefit in avoiding the convenience of an arbitrary and
        simple value for the key.  Is it the overhead of generating and
        storing one more value per tuple that you can’t abide?

    --
    Adrian Klaver
    adrian.klaver@xxxxxxxxxxx <mailto:adrian.klaver@xxxxxxxxxxx>

--
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

--
Adrian Klaver
adrian.klaver@xxxxxxxxxxx

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general