Re: MVIEW refresh consistently faster then insert ... select

Kevin Grittner <kgrittn@xxxxxxxxx> · Wed, 4 May 2016 02:06:41 -0500

On Wed, May 4, 2016 at 1:46 AM, Thomas Kellerer <spam_eater@xxxxxxx> wrote:

> I have a table that is an aggregation of another table.
> This aggregation reduces an input of ~14 million rows to ~4
> million rows.

> The refresh takes approx 2 minutes (fastest was 1:40) on our
> development server (CentOS, Postgres 9.5.0)

> However, when I create a materialized view:

> Subsequent refreshs using "REFRESH MATERIALIZED VIEW mv_stock"
> are consistently much faster: between 40 seconds and 1 minute
>
> I have run both refreshs about 10 times now, so caching effects
> should not be there.
>
> My question is: what is refresh mview doing differently then a
> plain insert ... select that it makes that so much faster?

Just to confirm, is this with or without the CONCURRENTLY keyword
on the REFRESH command?

If *without*, I would guess the difference is probably in creating
the index "from scratch" with sort and load versus retail insertion
of index entries.  You could approximate this by dropping the index
before the TRUNCATE and INSERT and creating it again after it is
loaded.

If *with*, I would guess that it is because most of the work is
done in temporary files and workspace, with just the delta applied
to the table and index in permanent storage.

It's hard to guess which way will be faster for the use case you
describe -- it will probably depend on what percentage of rows
remain unchanged on each REFRESH.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general