On 2013-03-15, lender <crlender@xxxxxxxxx> wrote:
> Hello.
>
> We are currently redesigning a medium/large office management web
> application. There are 75 tables in our existing PostgreSQL database,
> but that number is artificially low, due to some unfortunate design choices.
>
> The main culprits are two tables named "catalog" and "catalog_entries".
> They contain all those data sets that the previous designer deemed too
> small for a separate table, so now they are all stored together. The
> values in catalog_entries are typically used to populate dropdown select
> fields.

> So, my first main question would be: is it "normal" or desirable to have
> that many tiny tables? And is it a problem that many of the tables have
> the same (or a similar) column definitions?

Dunno about "normal", but certainly "Normal" (as in "-form"). No problem.

> The second point is that we have redundant unique identifiers in
> catalog_entries (id and code). The code value is used by the application
> whenever we need to find one of the values. For example, for a query
> like "show all open invoices", we would either -
>
> 1) select the id from catalog_entries where catalog_id refers to the
>    "invoice_status" catalog and the code is "open"
> 2) use that id to filter select * from invoices
>
> - or do the same in one query using joins. This pattern occurs hundreds
> of times in the application code. From a programming viewpoint, having
> all-text ids would make things a lot simpler and cleaner (i.e., keep
> only the "code" column).
>
> The "id" column was used (AFAIK) to reduce the storage size. Most of the
> data tables have less than 100k records, so the overhead wouldn't be too
> dramatic, but a few tables (~10) have more; one of them has 1.2m
> records. These tables can also refer to the old catalog_entries table
> from more than one column. Changing all these references from INT to
> VARCHAR would increase the DB size, and probably make scans less
> performant.
> I'm not sure how indexes on these columns would be
> affected.
>
> To summarize, the second question is whether we should ditch the
> artificial numeric IDs and just use the "code" column as primary key in
> the new tiny tables.

If they aren't hurting you, keep them.

> Thanks in advance for your advice.

If you're worried about clutter, it may make sense to put all the small
tables in a separate schema.

--
⚂⚃ 100% natural
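
For anyone comparing the two layouts discussed in this thread, here is a minimal sketch of the "show all open invoices" query against both. It is only an illustration: SQLite (via Python's standard sqlite3 module) stands in for PostgreSQL so the example is self-contained, and every name other than catalog, catalog_entries, invoices, invoice_status, id, code and catalog_id (the "_old"/"_new" suffixes, the name, status_id and status columns, the sample rows) is an assumption, not something taken from the original application.

```python
import sqlite3

# SQLite is used here only so the sketch runs anywhere; the thread is about
# PostgreSQL. Names not mentioned in the thread are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
conn.executescript("""
    -- Old layout: one generic pair of tables holds every small data set.
    CREATE TABLE catalog (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
    CREATE TABLE catalog_entries (
        id         INTEGER PRIMARY KEY,
        catalog_id INTEGER NOT NULL REFERENCES catalog(id),
        code       TEXT NOT NULL,
        UNIQUE (catalog_id, code)
    );
    CREATE TABLE invoices_old (
        id        INTEGER PRIMARY KEY,
        status_id INTEGER NOT NULL REFERENCES catalog_entries(id)
    );

    -- Proposed layout: one tiny table per data set, keyed by the text code.
    CREATE TABLE invoice_status (code TEXT PRIMARY KEY);
    CREATE TABLE invoices_new (
        id     INTEGER PRIMARY KEY,
        status TEXT NOT NULL REFERENCES invoice_status(code)
    );

    INSERT INTO catalog VALUES (1, 'invoice_status');
    INSERT INTO catalog_entries VALUES (10, 1, 'open'), (11, 1, 'paid');
    INSERT INTO invoices_old VALUES (1, 10), (2, 11), (3, 10);

    INSERT INTO invoice_status VALUES ('open'), ('paid');
    INSERT INTO invoices_new VALUES (1, 'open'), (2, 'paid'), (3, 'open');
""")

# Old layout: the code has to be resolved through two joins.
open_old = [row[0] for row in conn.execute("""
    SELECT i.id
    FROM invoices_old i
    JOIN catalog_entries e ON e.id = i.status_id
    JOIN catalog c ON c.id = e.catalog_id
    WHERE c.name = 'invoice_status' AND e.code = 'open'
    ORDER BY i.id
""")]

# Proposed layout: the code is stored directly, so no join is needed.
open_new = [row[0] for row in conn.execute(
    "SELECT id FROM invoices_new WHERE status = 'open' ORDER BY id"
)]

# The foreign key still rejects a code that is not in the small table.
try:
    conn.execute("INSERT INTO invoices_new VALUES (4, 'opne')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(open_old, open_new, rejected)
```

Both queries return the same invoices; the difference is the join overhead in application code versus the wider text column in the referencing tables. This says nothing about storage or scan cost on the 1.2m-row table, which would need measuring on the real data. In PostgreSQL the small tables could additionally be grouped in their own schema, as suggested above, and referenced with schema-qualified names.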