On Wednesday 08 April 2009 18:25:25 Ron Mayer wrote: > Robert Treat wrote: > > You can be sure that discussion of this topic in this forum will soon be > > visited by religious zealots, but the short answer is "nulls are bad, > > mmkay". A slightly longer answer would be that, as a general rule, > > attributes of your relations that only apply to 1% of the rows are better > > represented as a one > > To fulfill your prophecy of zealotry, I've got a number of tables > with columns that are mostly null that I can't think of that nice a > way of refactoring. I'd love ideas to improve the design, though. > > One example's an address table. Most addresses have a few fields > that are typically present (building number, city, state, etc). > Others, as described in various government's address standards, > are fields that are typically absent. For example in US addressing > rules, the "Urbanization Name" line: > http://www.usps.com/ncsc/addressstds/addressformats.htm > MRS MARIA SUAREZ Name > URB LAS GLADIOLAS Urbanization name > 150 CALLE A House no. and st. name > SAN JUAN PR 00926-3232 City, state, and ZIP+4 > Similarly sparse columns in my address tables are, > titles, division/department Names and mailstop codes. > (described here: http://pe.usps.gov/text/pub28/pub28c3_011.htm) > > While I realize I could stick in some string (empty string, or > some other magic string like "urbanization name doesn't apply to > this address") into a table, it sure is convenient to put nulls > in those columns. > > I'm quite curious what you'd suggest a well-designed address table > would look like without nulls. The decision here would depend on your perticular sect of the anti-null religion, but you have a couple of choices: 1) Break these fields out into one or more tables, containing entries only for those address that have the additional information. Ideally you might be able to do something like "extended_address_info" where all of these fields could be kept, all of them being non-null. I suspect you can't do the ideal, so you'd end up with a bunch of tables. This would be used by the "normalization trumps nullification" sect 2) Given that all of these columns have an authoritarian source of what should be allowed, you could use the "magic string" approach without requiring too much magic, and these columns could even be a foriegn key into a table containing the authoritarian options. This could be justified by the all nulls are bad sect, but might also be used by a null using crowd who take a strict approach to nulls meaning "unknown value", since here it isn't that the value is unknown; there isn't a valid value for these columns. (Adding the magic string to your FK table creates a valid reference value for those entries that would otherwise not match) Personally, if you force me into a "well-designed address table *without* nulls" decision, I would take this latter approach. HTH -- Robert Treat Conjecture: http://www.xzilla.net Consulting: http://www.omniti.com -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general