Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
Thank you Magnus (and others) for your replies.
The raw input data are in this, slightly strange format: 2019.05.01 00:00:00. No timezone
indicator, just an informal guarantee from the supplier that it is indeed utc. And no real chance of changing the format. We know, from experience.
The data volume is a bit high, a few billion rows pr month. So, table partitioning is
very helpful (aka really indispensable). Data will be aggregated in several ways for analytics. Time aggregations must be according to our local timezone (cet). We do not want data from one day being aggregated into the wrong date because of timezone issues.
This means that partition boundaries (monthly pratitions most often, sometimes day partitions) must be on CET-boundaries so that partition pruning will pull data from the relevant cet-month not the utc-month.
Now, if I load data into a timestamptz with timezone set to utc, partition to cet-boundaries,
query and aggredate with timezone set to cet, everything wil be ok, I think. My small testcase below shows that the row goes into the april-partition (as it should). The planner does the correct partition pruning according to specified filtering and set timezone.
All good.
create table t (t_id bigserial, ts timestamptz) partition by range (ts);
create table t_2020_02 partition of t for values from ('2020-02-01 00:00:00+01') to ('2020-03-01
00:00:00+01');
create table t_2020_03 partition of t for values from ('2020-03-01 00:00:00+01') to ('2020-04-01
00:00:00+02');
create table t_2020_04 partition of t for values from ('2020-04-01 00:00:00+02') to ('2020-05-01
00:00:00+02');
set timezone to 'utc';
insert into t (ts) values('2020-03-31 23:30:00');
Once again, thank you for invaluable feedback.