On 2020-03-21 14:51:35 -0600, Rob Sargent wrote: > > On Mar 21, 2020, at 1:13 PM, Peter J. Holzer <hjp-pgsql@xxxxxx> wrote: > > > > On 2020-03-21 12:55:33 -0600, Rob Sargent wrote: > >> To me the description of the ID smacks of database-in-the-name folly. I > >> recognize that OP is likely unable to take another path. I’ll not push this any > >> further. > > > > Identifiers often have internal structure. In Austria for example, the > > social security number contains the birth date. Invoice numbers, project > > numbers or court case numbers often contain the year. > > > > That's because they are used by *people*, and people like their > > identifiers to make some kind of sense. The computer doesn't care. > > Since OP said this was digital not paper, I see this as a presentation > problem bleeding into database design (assuming your response was an > invitation to push further). Well, that's the old natural primary key vs. surrogate primary key debate. (Trigger warning: You probably already heard that a gazillion times) A primary key is an attribute (or a combination of attributes) which unambiguosly identifies each entity. If the entity in question already has such an atttribute (e.g. an SSN, invoice number, username, ...) and you use that as the primary key, is is called a natural primary key[1]. If you generate a new attribute not related to existing attributes it is called a surrogate (primary) key. If the entity already has an attribute suitable as a primary key, why would you want to use another one? * The key candidate may be unwieldy: It might be a combination of several attributes, it might be a very long string, etc. * You may suspect that the key candidate is not in fact a key. There have been cases of duplicate SSNs in several countries, and of course not every person has an SSN. Some people say you should always use surrogate keys, never natural keys. Mostly because of the second reason, I think: When you design a system you have limited knowledge, so you can never be absolutely sure that what appears to be a key candidate actually is one. OTOH, surrogate keys have downsides, too: The attributes which would have been a natural key are still there and must be stored, input, output, maybe generated, used in business rules, etc. So that's a (usually modest) increase in application complexity. More importantly, everywhere you would have had a (to a domain expert) meaningful key you now have a meaningless key - which makes it harder to understand the data and makes extra joins necessary. So to inspect the contents of a table instead of a simple "select * from tablename" you need a join over three or four tables. To get back to the OP's problem: As I understand it he needs to generate that compound identifier anyway, because it is used outside of the application. He could use a surrogate key in addition to that, but I don't see any advantage to that. Most importantly it doesn't solve the problem he wants to discuss in this thread[2]. hp [1] But note that these attributes are almost always already artificial: Humans aren't born with an SSN - that is assigned by an authority to uniquely identify their clients; courts have sat for centuries without the need to number their cases; etc. [2] I admit I haven't been very helpful in this regard, either, going off on tangents at every opportunity. -- _ | Peter J. Holzer | Story must make more sense than reality. |_|_) | | | | | hjp@xxxxxx | -- Charles Stross, "Creative writing __/ | http://www.hjp.at/ | challenge!"
Attachment:
signature.asc
Description: PGP signature