Re: out-of-line (TOAST) storage ineffective when loading from dump?

Richard Huxton <dev@xxxxxxxxxxxx> · Tue, 19 Feb 2008 08:58:51 +0000

Markus Bertheau wrote:
Afaics, TOAST was invented so that big attributes wouldn't be in the
way (of readahead, the buffer cache and so on) when working with the
other attributes. This is based on the assumption that the other
attributes are accessed more often than the whole contents of the big
attributes.

Actually, I think it was designed as a way of working around PG's 8KB 
block-size. That imposed a maximum row size of the same and, well meant 
you couldn't have 2 x 5KB text fields for example.

Now I wonder how disk blocks are used when loading a dump with big
text data so that out-of-line storage is used. If disk block usage was
following this pattern:

heap page
toast heap page 1
toast heap page ..
toast heap page n

The "toast" pages are stored in a separate table - see manual for 
details. There's a whole chapter (53.2) on this.

If further the assumption is correct, that the granularity of the
lower level chaches is bigger than the PostgreSQL page size, then that
would mean that loading from a dump destroys the advantage of
out-of-line storage.

I haven't got any numbers to back this theory up.

I'm not sure what you mean by this. The page-cache will cache individual 
pages regardless of type afaik. A large data-load will probably mess up 
your cache for other processes. On the other hand, assuming you'll be 
building indexes etc. too then it's going to be in cache one way or another.

--
  Richard Huxton
  Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

              http://www.postgresql.org/docs/faq