Tom Lane wrote: > "rlee0001" <robeddielee@xxxxxxxxxxx> writes: > > ... I know, for example, that by default PostgreSQL assigns every record a > > small unique identifier called an OID. > > Well, actually, that hasn't been the default for some time, and even if > you turn it on it's not guaranteed unique without additional steps, and > seeing that it's only 32bits wide it's not usable as a unique ID for > very large tables. > > "unique" and "small" are more or less contradictory in this context. > > > ... it seems like this is the sort of thing that even a fully SQL-compliant > > DBMS could do internally to compensate for the performance issues with > > using large natural keys in relationships. > > What performance issues are those, exactly? I have seen no data that > proves that large key size is a bottleneck for reasonable schema designs > in Postgres. Maybe that just means that we have more fundamental > problems to fix :-( ... but there's no point in investing a lot of > effort on an aspect that isn't the next bottleneck. Well from what I understand, there are basically two reasons to use surrogate primary keys: 1) No reliable natural candidate key exists or 2) The natural candidate keys are percieved to be far too large/complex to use as the primary key. I have yet to hear anyone recommend the use of surrogate keys in modern databases for any other reason. Obviously that some entities in practice have no reliable natural keys cannot be helped and in those cases a surrogate key pretty much has to be used, enless the data modeler would find it acceptable to use every attribute in the entity as part of the primary key (in order to ensure uniqueness of each record overall). The second argument for the use of surrogate keys is simply that reliable natural candidate keys are often perceived to be too large to use as primary keys. The perception seems to be that large primary keys consume a considerable amount of storage space when used in foreign keys. For example, if I key "employee" by Last Name, First Name, Date of Hire and Department, I would need to store copies of all this data in any entity that relates to an employee (e.g. payroll, benefits and so on). In addition, if any of these fields change in value, that update would need to cascade to any related entities, which might be perceived as a performance issue if there are many related records. I'm not saying that PostgreSQL specifically has performance problems but that using large natural keys can hamper performance (both in terms of storage space required and cascading update time) when a lot of relationships exist between entities with such large keys. Personally I hate using surrogate keys except in places where no reliable natural key exists but find it nessisary in order to improve the efficiency of foreign keys in the database. So my proposal was simply to have the DBMS internally create and use an invisible surrogate key to identify and relate records, but expose the natural key to the environment. The currently OID implimentation cannot be used for this as you've already stated but I think a similar implimentation could work. I'm just wondering if anyone else would take advantage of the performance benefit, or perhaps sees a flaw in my logic. > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 1: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@xxxxxxxxxxxxxx so that your > message can get through to the mailing list cleanly