Search Postgresql Archives

Re: out-of-line (TOAST) storage ineffective when loading from dump?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Markus Bertheau wrote:
2008/2/19, Richard Huxton <dev@xxxxxxxxxxxx>:
I'm loading a table with some short attributes and a large toastable attribute.
That means that for every main table heap page several toast table heap pages
are written. This happens through the buffer cache and the background writer,
so maybe the pages aren't written in the order in which they were created in
the buffer cache, but if they are, they end up on disk (assuming that the file
system is not fragmented) roughly like that:

main table heap page 1
toast table heap page 1
toast table heap page .
toast table heap page n
main table heap page 2
toast table heap page n+1
toast table heap page .
toast table heap page 2n

Well, that's assuming:
1. You're not re-using space from previously deleted/updated rows.
2. You've not got a RAID array striping writes over multiple disks
3. The underlying filesystem + buffering isn't doing anything too clever.

Significantly later a sequential scan of the table has to be made, the
toastable attribute is not needed for the operation. The caches are cold or
otherwise occupied.

OK

> If the granularity of caches that are nearer to the disk in
the cache hierarchy than the PG buffer cache is higher than the PG page size
(i.e. a cache unit is bigger than the PG page size), then every read of a main
table heap page will inescapably read some toast table heap pages into the
cache (whichever cache that may be).

I think disk blocks on your more common file-systems are 4KB by default (ext2/3 and ntfs for example). I'm not aware of any default disk-block sizes more than the 8KB page-size of PG. Of course, the OS may read ahead if it sees you scanning, but it will do that on a file basis.

> If all the main table heap pages were
laid out adjecently on disk, they could be read faster and caches be polluted
less.

True enough. The key word there though is "if" - it means that PG is trying to out-think the filesystem, OS and hardware.

It should be easy enough to test on a particular system though.
1. pg_restore a sample table with TOASTed data.
2. pg_restore the same data but no TOASTed data.
3. cluster the table with TOASTed data (which should force a rewrite of the whole table but not its TOASTed data)

If the timing of various selects differ hugely then there's something worth investigating.

--
  Richard Huxton
  Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
      choose an index scan if your joining column's datatypes do not
      match

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux