PostgreSQL's materialized view functionality is very useful, but one
problem is that when the view gets large, there is no way to refresh
part of it. I know that table partitioning is coming in Postgres 10,
but I haven't heard anything about ideas for partitioning / sharding of
materialized views, so I'm trying to work out a strategy for doing it
manually until that's supported in the future.
Because there's no table inheritance for materialized views, the only
way I can think of to be able to get this behavior is to create
materialized views manually for each partition (e.g. each month) and
then use a UNION ALL view to act as the "parent" table. It looks
something like this:
CREATE MATERIALIZED VIEW prices_2016_04 AS (
SELECT sale_date, price FROM tbl
WHERE sale_date >= '2016/04/01' AND sale_date < '2016/05/01'
);
CREATE MATERIALIZED VIEW prices_2016_05 AS (
SELECT sale_date, price FROM tbl
WHERE sale_date >= '2016/05/01' AND sale_date < '2016/06/01'
);
CREATE MATERIALIZED VIEW prices_2016_06 AS (
SELECT sale_date, price FROM tbl
WHERE sale_date >= '2016/06/01' AND sale_date < '2016/07/01'
);
CREATE VIEW prices_2016 AS (
SELECT * FROM prices_2016_04
UNION ALL
SELECT * FROM prices_2016_05
UNION ALL
SELECT * FROM prices_2016_06
);
The problem comes when these views get large. Unlike when the
underlying objects are tables, I see no way to create CHECK constraints
on these materialized views, which means that queries like:
SELECT * FROM prices_2016 WHERE sale_date = '2016/04/15'
end up searching through all of the underlying materialized views, even
though the date in the query will only ever match one of them.
As a workaround, I have added logic to my queries to only search tables
for months where the time filters could match, but it would be ideal if
we could create CHECK constraints on materialized views, or if the query
planner could analyze the underlying query that creates the materialized
view to infer that certain rows will never match based on the conditions
that created the view.
Are either of these things that could be supported in the future? If
not, is there a better way to get this kind of behavior so that
materialized views are more useful when the amount of data increases and
it's not feasible to update them in their entirety?
Thanks.
-Tony
--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general