Re: Raid10 and page cache

On Tue, Dec 6, 2011 at 2:26 PM, NeilBrown <neilb@xxxxxxx> wrote:
> On Tue, 6 Dec 2011 14:01:14 -0800 Yucong Sun (叶雨飞) <sunyucong@xxxxxxxxx>
> wrote:
>
>> Hi,
>>
>> I recently set up raid10 on 4 physical disks and have iSCSI software serve it
>> as a block device, and have been trying to tune it for performance.
>>
>> The first thing I noticed is that MD seems to rely on the page cache to flush
>> changes to disk. Is there any way to turn that off so changes are
>> flushed straight to the disk, like O_FSYNC|O_DIRECT does? The reason I want to
>> turn it off is to understand the performance difference. I want to be
>> sure that the page cache is truly acting as a write-back cache. I know one
>> can tune the dirty_* settings to control the cache flush, but I want to make
>> sure that it is actually doing what I think it does.
>
> Why do you think this?
>
> md/raid10 sends all requests straight through to the relevant underlying
> device(s).
> Reads are just passed straight down.
> Writes are duplicated (the request structure, not the data) and queued to a
> separate thread which does the actual write, but it is fairly direct.

I know there is page caching and flushing involved because I watch
/proc/meminfo and see the Dirty value grow; after it reaches the
threshold, Writeback kicks in and the data is written out.
So if, as you said, md does no page flushing, then it must be because
the iscsi software opens the device without O_DIRECT, so it goes through the
page cache, which in turn flushes data to MD. Now it makes more sense.
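
For anyone who wants to repeat the /proc/meminfo observation above in a
more systematic way, here is a minimal sketch. It assumes the usual
"Name:   value kB" layout of /proc/meminfo; the helper names
(parse_meminfo, dirty_and_writeback, watch) are made up for illustration
and are not part of any existing tool.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: integer value}.

    Values are in kB for most fields; a few (e.g. HugePages_Total) are
    plain counts, which is why the unit is not checked here.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def dirty_and_writeback(text):
    """Return the (Dirty, Writeback) values from meminfo text, 0 if absent."""
    info = parse_meminfo(text)
    return info.get("Dirty", 0), info.get("Writeback", 0)


def watch(path="/proc/meminfo", samples=5, interval=1.0):
    """Print Dirty/Writeback a few times; path and defaults are assumptions."""
    for _ in range(samples):
        with open(path) as f:
            dirty, writeback = dirty_and_writeback(f.read())
        print(f"Dirty: {dirty} kB  Writeback: {writeback} kB")
        time.sleep(interval)
```

Calling watch() while the iscsi software is writing should show Dirty
climbing and Writeback becoming non-zero once flushing starts, if the
writes really do go through the page cache.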

But is an md write not a SYNC write? That is, after a write
call with O_DIRECT to the md device returns, could the data still
be in flight to the disk? How does having a bitmap come into
play? Does it work like the ext3 journal? After a power loss, can we
expect crash-consistent data on the disk?
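
As background for the question above: according to open(2), O_DIRECT on
its own does not give the guarantees of O_SYNC, so a caller that wants
write() to return only after the data has been handed to the device is
expected to use O_SYNC (or call fsync) as well. Below is a minimal sketch
of such a write. The function name write_synchronously is hypothetical,
it is shown on a regular file rather than an md device, and it leaves out
O_DIRECT because that also requires aligned buffers. It says nothing
about what md or the bitmap do underneath.

```python
import os


def write_synchronously(path, data):
    """Write data with O_SYNC, then fsync, and return the byte count.

    O_SYNC asks the kernel to make each write() wait for the data to be
    transferred to the underlying device; the extra fsync() is a
    belt-and-braces step. Whether the drive's own write cache has been
    flushed depends on the kernel and the device.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
    fd = os.open(path, flags, 0o600)
    try:
        view = memoryview(data)
        written = 0
        while written < len(view):
            # os.write may write fewer bytes than requested, so loop.
            written += os.write(fd, view[written:])
        os.fsync(fd)
    finally:
        os.close(fd)
    return written
```

Timing this against a plain buffered write of the same data is one rough
way to see how much of the throughput comes from write-back caching.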

Another thing to note: I found that the IO size on the MD device is always 4K,
which is the page size. Is that normal? I just want to make sure this
isn't bad behavior resulting from the iscsi software.
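
One way to check the 4K figure is to compute the average write size from
/proc/diskstats, which counts completed writes and sectors written per
device (sectors there are always 512-byte units). A minimal sketch
follows; the function name average_write_bytes and the device name md0
in the usage note are assumptions for illustration.

```python
SECTOR_BYTES = 512  # /proc/diskstats reports sectors in 512-byte units


def average_write_bytes(diskstats_text, device):
    """Return the average bytes per completed write for one device.

    Columns in /proc/diskstats: major, minor, name, reads completed,
    reads merged, sectors read, ms reading, writes completed,
    writes merged, sectors written, ...
    Returns None if the device is missing or has no completed writes.
    """
    for line in diskstats_text.splitlines():
        cols = line.split()
        if len(cols) >= 10 and cols[2] == device:
            writes_completed = int(cols[7])
            sectors_written = int(cols[9])
            if writes_completed == 0:
                return None
            return sectors_written * SECTOR_BYTES / writes_completed
    return None
```

For example, average_write_bytes(open("/proc/diskstats").read(), "md0")
gives a lifetime average, because the counters are cumulative since boot.
Taking two samples and differencing the counters would give the average
over just the test window.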
>
>>
>> Then I noticed that in the output of free, the number in the Cache column is
>> very low, while the Buffer number is very high. My question is: does Buffer
>> here serve as a read cache? I couldn't find the answer anywhere else.
>
> The best place to find the answer is in the source code.
>
> Every page in the page cache is associated with some file.
> If that file is a block device (e.g. /dev/sdX) then it is reported as
> 'Buffer' otherwise it is reported as 'Cache'.
>
> Some filesystems like ext3 use 'Buffer' memory for metadata but all use
> 'Cache' memory for files and directories.
>

Thanks, so it is being used as a read cache then. Too bad there's no easy
way to measure or see the hit rate.

>>
>> My last question: since MD already seems to be doing caching, what
>> effect would it have if I set up a LO device in front of the MD
>> device? Is there going to be more caching, and how is that different from
>> just a plain MD device?
>
> MD/raid10 does no caching.
> A loop-back over the md device would not add extra caching.
>
> NeilBrown
>
>
>>
>> Thanks.
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

