Re: Materializing the relation

Gregory Stark <stark@xxxxxxxxxxxxxxxx> · Mon, 11 Jun 2007 18:17:55 +0100

>>                            QUERY PLAN
>> ----------------------------------------------------------------
>>  Nested Loop  (cost=1.03..2.34 rows=3 width=24)
>>    Join Filter: (tb1.c1 = tb2.c1)
>>    ->  Seq Scan on tb2  (cost=0.00..1.04 rows=4 width=12)
>>    ->  Materialize  (cost=1.03..1.06 rows=3 width=12)
>>          ->  Seq Scan on tb1  (cost= 0.00..1.03 rows=3 width=12)
>>
>>
>> In this above plan, what does 'Material' mean?

It means it will read it all in to memory (or disk if it runs out of memory)
and use that instead of referring to the original table subsequently. In this
case it will only make a small difference in performance because the temporary
storage will be very similar to the original table. It just allows Postgres to
pack the data a bit more densely and skip MVCC visibility checks.

Other times Materialize nodes are used where they make a bigger difference are
when the plan beneath it is quite expensive and we don't want to have to
re-execute it more than necessary.

"Shoaib Mir" <shoaibmir@xxxxxxxxx> writes:

> From /src/include/utils/tuplestore.h
>
> "The materialize shields the sort from the need to do mark/restore and
> thereby allows it to perform its final merge pass on-the-fly; while the
> materialize itself is normally cheap since it won't spill to disk unless the
> number of tuples with equal key values exceeds work_mem"

This comment actually refers to a new optimization which isn't in the released
8.2 yet. It introduces a Materialize node above a sort to allow the sort to
skip the final merge step. Instead it merges as the query runs and the
Materialize throws away data which isn't needed any more since merge joins
only need the current key value and they don't need to refer back to previous
key values.

-- 
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com