Re: Could postgres12 support millions of sequences? (like 10 million)

"Peter J. Holzer" <hjp-pgsql@xxxxxx> · Sun, 22 Mar 2020 12:27:33 +0100

On 2020-03-21 14:51:35 -0600, Rob Sargent wrote:
> > On Mar 21, 2020, at 1:13 PM, Peter J. Holzer <hjp-pgsql@xxxxxx> wrote:
> > 
> > On 2020-03-21 12:55:33 -0600, Rob Sargent wrote:
> >> To me the description of the ID smacks of database-in-the-name folly. I
> >> recognize that OP is likely unable to take another path. I’ll not push this any
> >> further. 
> > 
> > Identifiers often have internal structure. In Austria for example, the
> > social security number contains the birth date. Invoice numbers, project
> > numbers or court case numbers often contain the year.
> > 
> > That's because they are used by *people*, and people like their
> > identifiers to make some kind of sense. The computer doesn't care.
> 
> Since OP said this was digital not paper, I see this as a presentation
> problem bleeding into database design (assuming your response was an
> invitation to push further).

Well, that's the old natural primary key vs. surrogate primary key
debate.

(Trigger warning: You probably already heard that a gazillion times)

A primary key is an attribute (or a combination of attributes) which
unambiguosly identifies each entity. 

If the entity in question already has such an atttribute (e.g. an SSN,
invoice number, username, ...) and you use that as the primary key, is
is called a natural primary key[1].

If you generate a new attribute not related to existing attributes it is
called a surrogate (primary) key.

If the entity already has an attribute suitable as a primary key, why
would you want to use another one?

* The key candidate may be unwieldy: It might be a combination of
  several attributes, it might be a very long string, etc.
* You may suspect that the key candidate is not in fact a key. There
  have been cases of duplicate SSNs in several countries, and of course
  not every person has an SSN.

Some people say you should always use surrogate keys, never natural
keys. Mostly because of the second reason, I think: When you design a
system you have limited knowledge, so you can never be absolutely sure
that what appears to be a key candidate actually is one.

OTOH, surrogate keys have downsides, too: The attributes which would
have been a natural key are still there and must be stored, input,
output, maybe generated, used in business rules, etc. So that's a
(usually modest) increase in application complexity. More importantly,
everywhere you would have had a (to a domain expert) meaningful key you
now have a meaningless key - which makes it harder to understand the
data and makes extra joins necessary. So to inspect the contents of a
table instead of a simple "select * from tablename" you need a join over
three or four tables.

To get back to the OP's problem: As I understand it he needs to generate
that compound identifier anyway, because it is used outside of the
application. He could use a surrogate key in addition to that, but I
don't see any advantage to that. Most importantly it doesn't solve the
problem he wants to discuss in this thread[2].

        hp

[1] But note that these attributes are almost always already artificial:
    Humans aren't born with an SSN - that is assigned by an authority to
    uniquely identify their clients; courts have sat for centuries
    without the need to number their cases; etc.

[2] I admit I haven't been very helpful in this regard, either, going
    off on tangents at every opportunity.

-- 
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@xxxxxx         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"
Attachment:
signature.asc

Description: PGP signature