Search Postgresql Archives

Re: Database/Table Design for Global Country Statistics

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Stefan Schwarzer wrote:
Hi there,

I learned in another posting that my table design - in a polite way - "could be improved".

So, before doing any additional design errors, I would like to get feedback, if possible.

I am dealing with some 500 tables for worldwide national statistics (GDP, population, environment etc.), covering approx. 30 years each. For each of these variables, I usually have as well (pre-prepared) subregional and regional aggregations too. These could - and should - at the end be calculated on-the-fly, and not pre-calculated and imported from Excel as it is for the moment.

Might want a trigger to keep them up to date.

My (national) table for a given variable is in the moment as follows (id being the identifier for a specific country):

id   |    1970    |    1971    |    ...    |    2004    |    2005
-------------------------------------------------------------------
 1   |   NULL    |      36       |   ...     |      42      |      45
 2 ......

Ick - fiddly if you we ever reach 2006...

The new design would be like this:

id   |    year    |    value
-------------------------------
 1   |   1970    |     NULL
 1   |   1971    |      36
 1   ....
 1   |   2005    |      45
 2   |   1970    |      ....
 2   .....


Would that be considered as "good table design" then?

Well if "id" is a country I'd rename it "country_id".

If all years+countries should have the same set of measurements you might want it in one table:
 (country_id, year, population, area, roads_in_km, ...)

However, if some measurements aren't relevant you're probably better off with separate table for each measurement: country_population, country_area etc.

The issue with NULLs is what do they mean. They should mean "unknown", nothing more and nothing less. However, frequently you want to distinguish between "no figure available" and "not applicable" (e.g. "monarch" isn't relevant except in monarchies).

In this case it's probably best practice to separate out the values:
 (country_id integer, year integer, has_monarch boolean, monarch text)
This way you can distinguish between has_monarch=false and monarch=''

--
  Richard Huxton
  Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux