Search Postgresql Archives

Re: Calculate Weighted Aggregation between Two Tables

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 09/19/07 04:13, Stefan Schwarzer wrote:
> Hi,
> 
> I have global national statistics (GDP, Fish Catch etc.), which I would
> like to aggregate to and calculate per Capita data for given regions
> (Africa, Europe...) and subregions (Western Africa, ....) on-the-fly.
> 
> From a statistical point of view it seems best to use something like this:
> 
>     given that the variable is "v" and the Total Population table is "tp":
> 
>    (country_1_of_region_1_for_v / country_1_of_region_1_for_tp)
> + (country_2_of_region_1_for_v / country_2_of_region_1_for_tp) + ....
>  
>  ----------------------------------------------------------------
>                                                           
>  (sum_countries_all_of_region_1_for_v /
> sum_countries_all_of_region_1_for_tp)
> 
> 
>     and then same thing for the other regions (in my case a total of 6
> regions).
> 
> 
> I have a table of countries with a field country_id and region_id (which
> specifies to which region this country belongs).
> I have the table for the variable with a field country_id, which has
> kind of a "Excel design", i.e. columns for each year.

Well that's a Very Bad Thing.  (Wasn't there a thread about this
last week?)

Since the "years" are columns, are the "variables" in rows?

> And I have a table of regions with a field region_id.
> 
> It seems like a rather complicated thing to do, as one needs first to
> get all the countries of a specific region for both tables (variable &
> pop_total), have the SUM(of all countries) available and run the
> mathematical procedure... 
> 
> Could someone give me a hint how this could be achieved?

This would be bog-simple if your tables were structured like this:

CREATE TABLE T_REGION (
REGION_ID CHAR(8) PRIMARY KEY,
DESCRIP  TEXT);

CREATE TABLE T_COUNTRY (
COUNTRY_ID  CHAR(8) PRIMARY KEY,
REGION_ID   CHAR(8) FOREIGN KEY REFERENCES (T_REGION.REGION_ID),
DESCRIP     TEXT);

CREATE TABLE T_NATIONAL_STATISTICS (
COUNTRY_ID CHAR(8),
STAT_YEAR  SMALLINT CHECK VALUES BETWEEN(1900 AND 2020),
FORECAST   BOOL,
GOVT_TYPE  CHAR(4) FOREIGN KEY REFERENCES (T_GOVERNMENT.GOVT_TYPE),
POPULATION BIGINT,
GDP        BIGINT,
FISH_CATCH BIGINT,
FISH_CONSUMED BIGINT,
WHEAT_HARVEST BIGINT,
WHEAT_CONSUMED BIGINT
CRUDE_PROD_BBL BIGINT, -- crude oil production,
CRUDE_CONSUM_BBL BIGINT, -- crude oil consumtion,
etc,
etc,
etc,
PRIMARY_KEY (COUNTRY_ID, STAT_YEAR));

(Syntax is a bit off, but you get the meaning...)
(I like human-readable keys.)

Whenever you find a new statistic you want to track, you add a new
column to T_NATIONAL_STATISTICS.

So, to (I think) get info on per-capita fish "usage" by year, by
country, for West Africa, you do:

SELECT NS.STAT_YEAR,
       NS.COUNTRY_ID,
       CAST(CAST(NS.FISH_CATCH AS DOUBLE)/NS.POPULATION
                AS NUMERIC(8,2)) AS CATCH_PERCAP,
       CAST(CAST(NS.FISH_CONSUMED AS DOUBLE)/NS.POPULATION
                AS NUMERIC(8,2)) AS EATEN_PERCAP,
       CAST(CAST(NS.FISH_CATCH AS DOUBLE)/NS.POPULATION) -
             CAST(NS.FISH_CATCH AS DOUBLE)/NS.POPULATION)
             AS NUMERIC(8,2)) AS FIST_SELF_SUFFICIENCY
FROM T_REGION R,
     T_COUNTRY C,
     T_NATIONAL_STATISTICS NS
WHERE R.REGION_ID = 'WESTAFRI'
  AND C.REGION_ID = R.REGION_ID
  AND NS.COUNTRY_ID = C.COUNTRY_ID;

You might want to left outer-join NS to C to indicate any west
African countries with missing statistics.

- --
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFG8QGvS9HxQb37XmcRAprKAKCiDphSaLe1nTIf8tfaIwNCx/zZngCgpfeo
QF+ojV0/jYlLbLP+BDV86p4=
=7o+G
-----END PGP SIGNATURE-----

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
       choose an index scan if your joining column's datatypes do not
       match

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux