Re: ext4 finally doing the right thing

Aidan Van Dyk <aidan@xxxxxxxxxxx> · Thu, 21 Jan 2010 08:51:29 -0500

* Greg Smith <greg@xxxxxxxxxxxxxxx> [100121 00:58]:
> Greg Stark wrote:
>>
>> That doesn't sound right. The kernel having 10% of memory dirty  
>> doesn't mean there's a queue you have to jump at all. You don't get  
>> into any queue until the kernel initiates write-out which will be  
>> based on the usage counters -- basically a lru. fsync and cousins like  
>> sync_file_range and posix_fadvise(DONT_NEED) initiate write-out  
>> right away.
>>
>
> Most safe ways ext3 knows how to initiate a write-out on something that  
> must go (because it's gotten an fsync on data there) requires flushing  
> every outstanding write to that filesystem along with it.  So as soon as  
> a single WAL write shows up, bam!  The whole cache is emptied (or at  
> least everything associated with that filesystem), and the caller who  
> asked for that little write is stuck waiting for everything to clear  
> before their fsync returns success.

Sure, if your WAL is on the same FS as your data, you're going to get
hit, and *especially* on ext3...

But I think that's one of the reasons people usually recommend putting
the WAL on its own filesystem.  Even if it's just another partition on the
same (set of) disk(s), you get the benefit of not having to wait for all
the dirty ext3 pages from your whole database FS to be flushed before the
WAL write can complete on its own FS.
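
If you want to check this on your own box, something like the following
rough sketch might help.  It is not anything from PostgreSQL itself: the
function names are made up for illustration, and it assumes that comparing
st_dev is a good enough test for "same filesystem" (true for plain
partitions, not necessarily for things like btrfs subvolumes).  It writes
one small block into a scratch file and times only the fsync, which is
roughly what a lone WAL write would be waiting on.

```python
import os
import tempfile
import time


def same_filesystem(path_a, path_b):
    """Return True if both paths report the same device ID (st_dev).

    Assumption: equal st_dev means "same filesystem", which holds for
    ordinary partitions but is only a heuristic in general.
    """
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def timed_fsync_write(directory, payload=b"x" * 8192):
    """Write payload to a scratch file in directory and time the fsync.

    Returns the number of seconds the fsync call took. The scratch file
    is removed afterwards.
    """
    fd, name = tempfile.mkstemp(dir=directory)
    try:
        os.write(fd, payload)
        start = time.monotonic()
        os.fsync(fd)  # blocks until the kernel reports the data as flushed
        return time.monotonic() - start
    finally:
        os.close(fd)
        os.unlink(name)


def report(data_dir, wal_dir):
    """Build a one-line summary comparing fsync latency in two directories."""
    shared = same_filesystem(data_dir, wal_dir)
    return "same_fs=%s data_fsync=%.4fs wal_fsync=%.4fs" % (
        shared,
        timed_fsync_write(data_dir),
        timed_fsync_write(wal_dir),
    )
```

Calling report() with your data directory and your WAL directory while
something else is dirtying lots of pages on the data FS should show whether
the small fsync gets stuck behind them.  A single run is noisy, so treat
the numbers as a hint, not a benchmark.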

a.

-- 
Aidan Van Dyk                                             Create like a god,
aidan@xxxxxxxxxxx                                       command like a king,
http://www.highrise.ca/                                   work like a slave.