
Re: ZFS prefetch considered evil?


 



On Jul 9, 2009, at 3:53 AM, Yaroslav Tykhiy wrote:

On 08/07/2009, at 8:39 PM, Alban Hertroys wrote:

On Jul 8, 2009, at 2:50 AM, Yaroslav Tykhiy wrote:
IIRC prefetch tries to keep data (disk blocks?) in memory that it fetched recently.

What you described is just a disk cache. And a trivial implementation of prefetch would work as follows: An application or other file/disk consumer asks the provider (driver, kernel, whatever) to read, say, 2 disk blocks worth of data. The provider thinks, "I know you are short-sighted; I bet you are going to ask for more contiguous blocks very soon," so it schedules a disk read for many more contiguous blocks than requested and caches them in RAM. For bulk data applications such as file serving this trick works like a charm. But other applications do truly random access and never come back for the prefetched blocks; in this case both disk bandwidth and cache space are wasted. An advanced implementation can try to distinguish sequential and random access patterns, but in reality that appears to be a challenging task.
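To put rough numbers on that description, here is a toy simulation. It is only a sketch: the function `simulate`, the LRU cache model, and the parameters `prefetch=8` and `cache_blocks=64` are assumptions made up for illustration, not how ZFS or any real kernel implements prefetch.

```python
import random
from collections import OrderedDict


def simulate(requests, prefetch=8, cache_blocks=64):
    """Toy model: return (blocks_read_from_disk, cache_hits) for a request stream."""
    cache = OrderedDict()  # block number -> None, kept in LRU order
    blocks_read = 0
    hits = 0
    for block in requests:
        if block in cache:
            # Served from RAM; mark the block as recently used.
            cache.move_to_end(block)
            hits += 1
            continue
        # Miss: read the requested block plus `prefetch` following blocks.
        for b in range(block, block + 1 + prefetch):
            if b not in cache:
                blocks_read += 1
            cache[b] = None
            cache.move_to_end(b)
        # Evict least recently used blocks once the cache is over capacity.
        while len(cache) > cache_blocks:
            cache.popitem(last=False)
    return blocks_read, hits


# Sequential scan versus widely scattered single-block reads (hypothetical workloads).
sequential = list(range(1000))
rng = random.Random(0)
scattered = [rng.randrange(10_000_000) for _ in range(1000)]

seq_read, seq_hits = simulate(sequential)
rnd_read, rnd_hits = simulate(scattered)
print("sequential:", seq_read, "blocks read,", seq_hits, "cache hits")
print("scattered: ", rnd_read, "blocks read,", rnd_hits, "cache hits")
```

In this model the sequential scan reads barely more blocks than it was asked for and most requests become cache hits, while the scattered workload reads roughly nine blocks per request and almost never benefits from them.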

Ah yes, thanks for the correction, I now remember reading about that before. Makes the name 'prefetch' that much more fitting, doesn't it?

And as you say, it's not that useful a feature with random access (hadn't thought about that); in fact, I can imagine that it might delay moving the disk heads to the next desired (random) position, as the FS is still requesting data that it isn't going to need (except for some lucky cases), unless it manages to detect the randomness of the access patterns. You can't predict randomness from just read requests of course, since you don't know about the requests that are still to come. You can however assume something like that is the case if historic requests turned out to be random by nature, but then you'd want to know for which area of the FS this is the case.
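The "look at historic requests per area" idea could be sketched roughly like this. Everything here is hypothetical: the class `RegionPrefetchAdvisor`, the region size, the history length and the threshold are invented for illustration and say nothing about what ZFS actually does.

```python
from collections import defaultdict, deque


class RegionPrefetchAdvisor:
    """Hypothetical heuristic: per region, track how often a read continued the previous one."""

    def __init__(self, region_blocks=1024, history=32, threshold=0.5):
        self.region_blocks = region_blocks  # blocks per "area" of the FS (assumed)
        self.threshold = threshold          # minimum share of sequential reads
        self.last_block = {}                # region -> last block read there
        self.recent = defaultdict(lambda: deque(maxlen=history))

    def record(self, block):
        """Remember whether this read directly followed the previous read in its region."""
        region = block // self.region_blocks
        prev = self.last_block.get(region)
        if prev is not None:
            self.recent[region].append(block == prev + 1)
        self.last_block[region] = block

    def should_prefetch(self, block):
        """Advise prefetch only where recent history in that region looked sequential."""
        outcomes = self.recent.get(block // self.region_blocks)
        if not outcomes:
            return True  # no history yet: default to prefetching (arbitrary choice)
        return sum(outcomes) / len(outcomes) >= self.threshold


advisor = RegionPrefetchAdvisor()
for b in range(100):                          # region 0: a sequential scan
    advisor.record(b)
for b in (5120, 6000, 5300, 5900, 5125):      # region 5: scattered reads
    advisor.record(b)

print(advisor.should_prefetch(100))   # region 0
print(advisor.should_prefetch(5500))  # region 5
```

The weakness is the one already mentioned: this only looks backwards, so a region that changes its access pattern will be misjudged until enough new history has accumulated.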

I don't know how you partitioned your zpools, but to me it seems like it'd be preferable to have the PostgreSQL tablespaces (and possibly other data that's likely to be accessed randomly) in a separate zpool from the rest of the system so you can restrict disabling prefetch to just that file-system. You probably already did that...

It could be interesting to see how clustering the relevant tables would affect the prefetch performance, I'd expect disk access to be less random that way. It's probably still better to disable prefetch though.

ZFS uses quite a bit of memory, so if you distributed all your memory to be used by just postgres and disk cache then you didn't leave enough space for the prefetch data and _something_ will be moved to swap.

I hope you know that FreeBSD is exceptionally good at distributing available memory between its consumers. That said, useless prefetch indeed puts extra pressure on disk cache and results in unnecessary cache evictions, thus making things even worse. It is true that ZFS is memory hungry and so rather sensitive to non-optimal memory use patterns. Useless prefetch wastes memory that could be used to speed up other ZFS operations.

Yes, I do know that, it's one of the reasons I prefer it over other OSs. The keyword here was 'available memory' though, under the assumption that something was hitting swap. But apparently that wasn't the case.

You'll probably want to ask about this on the FreeBSD mailing lists as well, they'll know much better than I do ;)

Are you a local FreeBSD expert? ;-) Jokes apart, I don't think this topic has to do with FreeBSD as such; it is mostly about making the advanced technologies of PostgreSQL and ZFS go well together. Even ZFS developers admit that in database-related applications exceptions to general ZFS practices and rules may be called for.

I wouldn't call myself an expert, I just use it on a few systems at home and am more a user than an administrator. I do read the stable/current mailing lists though (since 2004 according to my mail client) and keep an eye on (among others) the ZFS discussions, as I feel tempted to change my gmirrors into zpools some day. It certainly looks like an interesting FS, very flexible and reliable.

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.





--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
