On 9/27/07, Stefan Schwarzer <stefan.schwarzer@xxxxxxxxxxxx> wrote:
> I really tried it out. I changed my whole database to the
> "id-country | year | value" format, and then tried to build my SQL
> queries. But it was really, really difficult, and sometimes impossible
> (for me) to get what I wanted.
>
> Actually, it remains quite difficult for me to remember the actual
> problems I had with it.
>
> But, for example, I need to aggregate the national data on the fly to
> their regions, and I need to calculate per-capita data on the fly for
> each variable. Although one would say this should be simple to
> accomplish, a semi-professional database expert and I could hardly
> solve these things.

You should have brought your problem here. You'd be surprised what a
fresh set of eyes can see.

> In one case we came up with as many sub-selects as years were
> available (or selected by the user) (that can be up to 60 years). Is
> this "efficient" SQL programming? Probably not.

But that doesn't mean it was the right approach either. There may well
have been a more efficient approach you didn't think of.

> What would you recommend for, say, 500 global national statistical
> variables, 500 regional, 500 subregional, and 500 global aggregations?
> The years covered range from 10 to 60 years for each of these
> variables, all available for 240 countries/territories.

I generally approach such problems by putting the data right
(normalized) at the start, then munging the data into summary tables to
handle the problems you're seeing now. I find it far easier to maintain
normalized tables that produce non-normalized ones (for things like
data warehousing) than it is to maintain non-normalized tables and try
to produce normalized data from them.
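
To make that concrete, here is a minimal sketch of what I mean. The
table and column names (country, observation, variable_id, region_id,
region_summary) are hypothetical, and I'm assuming population is stored
as just another variable; adjust to your own schema. It shows a
normalized layout, on-the-fly regional aggregation, an on-the-fly
per-capita calculation, and a summary table derived from the normalized
data:

    -- Hypothetical normalized layout: one row per country, variable, year.
    CREATE TABLE country (
        country_id  integer PRIMARY KEY,
        name        text    NOT NULL,
        region_id   integer NOT NULL
    );

    CREATE TABLE observation (
        country_id  integer NOT NULL REFERENCES country,
        variable_id integer NOT NULL,  -- e.g. 1 = CO2 emissions, 2 = population
        year        integer NOT NULL,
        value       numeric NOT NULL,
        PRIMARY KEY (country_id, variable_id, year)
    );

    -- Regional aggregation on the fly: sum one variable per region and year.
    SELECT c.region_id, o.year, sum(o.value) AS region_total
    FROM observation o
    JOIN country c USING (country_id)
    WHERE o.variable_id = 1
    GROUP BY c.region_id, o.year;

    -- Per-capita values on the fly: join the variable against population,
    -- which is assumed here to be variable_id = 2.
    SELECT v.country_id, v.year,
           v.value / NULLIF(p.value, 0) AS per_capita  -- NULL if population is 0
    FROM observation v
    JOIN observation p
      ON  p.country_id  = v.country_id
      AND p.year        = v.year
      AND p.variable_id = 2
    WHERE v.variable_id = 1;

    -- Summary table built from the normalized data, refreshed periodically,
    -- so the interactive queries stay simple and fast.
    CREATE TABLE region_summary AS
    SELECT c.region_id, o.variable_id, o.year, sum(o.value) AS value
    FROM observation o
    JOIN country c USING (country_id)
    GROUP BY c.region_id, o.variable_id, o.year;

If the one-sub-select-per-year problem was about pivoting years into
columns for display, the crosstab() function in contrib/tablefunc is
one way to do that without writing a sub-select per year.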