Search Postgresql Archives

big un stacking query - save me from myself

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I have a fairly large table with two columns that I need to "de-normalize" (235 million rows) There has got to be a better (i.e. faster) approach than what I am doing. I am using a MAX CASE on each of the 24 variables (column names variable and value) that I want to unstack.  Any suggestions would be most appreciated.  Here is the ANALYZE 

GroupAggregate  (cost=107156950.16..174988710.12 rows=23511876 width=286)
  ->  Sort  (cost=107156950.16..107744747.04 rows=235118752 width=286)
        Sort Key: site, canopy, block, plot, measurement_interval, warming_treatment, treatment_code, treatment_abbr, water_treatment, variable_name, variable_channel, variable_id, rowid, time2, "timestamp", block_name, table_name, batt_volt, program, flag1, flag2, flag3, amb_a_avg, amb_closed_avg, airtc_avg, airtemp_avg, airtemp_max, airtemp_min, all_avgt, am25tref1, ptemp_avg, rh, s_all_avgt_avg, vp_avg, tabove_sdupper, tabove_sdlower, tabove_meantrim, tabove_mean_dc, tsoil_sdupper, tsoil_sdlower, tsoil_meantrim, tsoil_mean_dc
        ->  Seq Scan on derived_15min_joined  (cost=0.00..11782048.52 rows=235118752 width=286)
(4 rows)


And here is the table:


b4warmed3=# \d derived_15min_joined
             Table "public.derived_15min_joined"
       Column        |            Type             | Modifiers 
----------------------+-----------------------------+-----------
site                 | character varying(6)        | 
canopy               | character varying(24)       | 
block                | character(2)                | 
plot                 | character(6)                | 
measurement_interval | interval                    | 
warming_treatment    | character varying(24)       | 
treatment_code       | character varying(24)       | 
treatment_abbr       | character varying(24)       | 
water_treatment      | character varying(24)       | 
variable_name        | character varying(24)       | 
variable_channel     | character varying(24)       | 
variable_id          | character varying(24)       | 
rowid                | character varying(48)       | 
time2                | timestamp without time zone | 
timestamp            | timestamp without time zone | 
block_name           | character varying(8)        | 
table_name           | character varying(10)       | 
batt_volt            | real                        | 
program              | character varying(48)       | 
flag1                | integer                     | 
flag2                | integer                     | 
flag3                | integer                     | 
amb_a_avg            | real                        | 
amb_closed_avg       | real                        | 
airtc_avg            | real                        | 
airtemp_avg          | real                        | 
airtemp_max          | real                        | 
airtemp_min          | real                        | 
all_avgt             | real                        | 
am25tref1            | real                        | 
ptemp_avg            | real                        | 
rh                   | real                        | 
s_all_avgt_avg       | real                        | 
vp_avg               | real                        | 
tabove_sdupper       | real                        | 
tabove_sdlower       | real                        | 
tabove_meantrim      | real                        | 
tabove_mean_dc       | real                        | 
tsoil_sdupper        | real                        | 
tsoil_sdlower        | real                        | 
tsoil_meantrim       | real                        | 
tsoil_mean_dc        | real                        | 
variable             | text                        | 
value                | real                        | 
Indexes:
   "derived_15min_joined_lower_idx" btree (lower(variable_name::text))
   "derived_15min_joined_time2_idx" btree (time2)

b4warmed3=# 

And here is my query. 

SELECT
	site,
	canopy,
	block,
	plot,
	measurement_interval,
	warming_treatment,
	treatment_code,
	treatment_abbr,
	water_treatment,
	variable_name,
	variable_channel,
	variable_id,
	rowid,
	time2,
	timestamp,
	block_name,
	table_name,
	batt_volt,
	program,
	flag1,
	flag2,
	flag3,
	amb_a_avg,
	amb_closed_avg,
	airtc_avg,
	airtemp_avg,
	airtemp_max,
	airtemp_min,
	all_avgt,
	am25tref1,
	ptemp_avg,
	rh,
	s_all_avgt_avg,
	vp_avg,
	tabove_sdupper,
	tabove_sdlower,
	tabove_meantrim,
	tabove_mean_dc,
	tsoil_sdupper,
	tsoil_sdlower,
	tsoil_meantrim,
	tsoil_mean_dc,
	MAX (
		CASE
		WHEN lower(variable_name) = 'tabove' THEN

		value

		END
	) AS tabove,
	MAX (
		CASE
		WHEN lower(variable_name) = 'tabove_sc' THEN

		value

		END
	) AS tabove_sc,
	MAX (
		CASE
		WHEN lower(variable_name) = 'tabove_delta' THEN

		value

		END
	) AS tabove_delta,
	MAX (
		CASE
		WHEN lower(variable_name) = 'tsoil' THEN

		value

		END
	) AS tsoil,
	MAX (
		CASE
		WHEN lower(variable_name) = 'tsoil_sc' THEN

		value

		END
	) AS tsoil_sc,
	MAX (
		CASE
		WHEN lower(variable_name) = 'tsoil_delta' THEN

		value

		END
	) AS tsoil_delta,
	MAX (
		CASE
		WHEN lower(variable_name) = 's_sdm_out' THEN

		value

		END
	) AS s_sdm_out,
	MAX (
		CASE
		WHEN lower(variable_name) = 'sbtemp' THEN

		value

		END
	) AS sbtemp,
	MAX (
		CASE
		WHEN lower(variable_name) = 'heat_a_avg' THEN

		value

		END
	) AS heat_a_avg,
	MAX (
		CASE
		WHEN lower(variable_name) = 'b_dc_avg' THEN

		value

		END
	) AS b_dc_avg,
	MAX (
		CASE
		WHEN lower(variable_name) = 'targettemp' THEN

		value

		END
	) AS targettemp,
	MAX (
		CASE
		WHEN lower(variable_name) = 's_scldout' THEN

		value

		END
	) AS s_scldout,
	MAX (
		CASE
		WHEN lower(variable_name) = 'tmv' THEN

		value

		END
	) AS tmv,
	MAX (
		CASE
		WHEN lower(variable_name) = 'a_dc' THEN

		value

		END
	) AS a_dc,
	MAX (
		CASE
		WHEN lower(variable_name) = 'a_targettemp' THEN

		value

		END
	) AS a_targettemp,
	MAX (
		CASE
		WHEN lower(variable_name) = 'scldout' THEN

		value

		END
	) AS scldout,
	MAX (
		CASE
		WHEN lower(variable_name) = 'pid_lmt' THEN

		value

		END
	) AS pid_lmt,
	MAX (
		CASE
		WHEN lower(variable_name) = 'targettemp_adj' THEN

		value

		END
	) AS targettemp_adj,
	MAX (
		CASE
		WHEN lower(variable_name) = 'sdm_out' THEN

		value

		END
	) AS sdm_out,
	MAX (
		CASE
		WHEN lower(variable_name) = 's_pid_lmt' THEN

		value

		END
	) AS s_pid_lmt,
	MAX (
		CASE
		WHEN lower(variable_name) = 'tsoilr' THEN

		value

		END
	) AS tsoilr,
	MAX (
		CASE
		WHEN lower(variable_name) = 's_pid_out' THEN

		value

		END
	) AS s_pid_out,
	MAX (
		CASE
		WHEN lower(variable_name) = 'sctemp' THEN

		value

		END
	) AS sctemp,
	MAX (
		CASE
		WHEN lower(variable_name) = 'amb_avg' THEN

		value

		END
	) AS amb_avg
FROM
	derived_15min_joined
GROUP BY
	site,
	canopy,
	block,
	plot,
	measurement_interval,
	warming_treatment,
	treatment_code,
	treatment_abbr,
	water_treatment,
	variable_name,
	variable_channel,
	variable_id,
	rowid,
	time2,
	timestamp,
	block_name,
	table_name,
	batt_volt,
	program,
	flag1,
	flag2,
	flag3,
	amb_a_avg,
	amb_closed_avg,
	airtc_avg,
	airtemp_avg,
	airtemp_max,
	airtemp_min,
	all_avgt,
	am25tref1,
	ptemp_avg,
	rh,
	s_all_avgt_avg,
	vp_avg,
	tabove_sdupper,
	tabove_sdlower,
	tabove_meantrim,
	tabove_mean_dc,
	tsoil_sdupper,
	tsoil_sdlower,
	tsoil_meantrim,
	tsoil_mean_dc 
;





-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general



[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux