RE: Tuning NFS client write pagecache


 




>-----Original Message-----
>From: linux-nfs-owner@xxxxxxxxxxxxxxx [mailto:linux-nfs-
>owner@xxxxxxxxxxxxxxx] On Behalf Of Chuck Lever
>Sent: Tuesday, August 10, 2010 9:27 AM
>To: Peter Chacko
>Cc: Trond Myklebust; Jim Rees; Matthew Hodgson; linux-nfs@xxxxxxxxxxxxxxx
>Subject: Re: Tuning NFS client write pagecache
>
>
>On Aug 6, 2010, at 9:15 PM, Peter Chacko wrote:
>
>> I think you are not understanding the use case of file-system-wide,
>> non-cached IO for NFS.
>>
>> Imagine a case where a Unix shell programmer who doesn't know C
>> programming or system calls creates a backup script; he just wants
>> to use cp -R sourcedir /targetDir, where /targetDir is an NFS-mounted
>> share.
>>
>> How can we use a programmatic, per-file-session interface to the
>> O_DIRECT flag here?
>>
>> We need a file-system-wide direct IO mechanism, and the best place to
>> have it is at mount time. We cannot tell all sysadmins to go and learn
>> programming, or backup vendors to change code they wrote 10-12 years
>> ago. Operating system functionality should cover a large audience with
>> different levels of training and skills.
>>
>> I hope you got my point here....
>
>The reason Linux doesn't support a filesystem wide option is that direct
>I/O has as much potential to degrade performance as it does to improve it.
>The performance degradation can affect other applications on the same file
>system and other clients connected to the same server.  So it can be an
>exceptionally unfriendly thing to do for your neighbors if an application
>is stupid or malicious.

Please forgive my ignorance, but could you give an example or two?  I can understand how direct I/O can degrade the performance of the application that is using it, but I can't see how other applications' performance would be affected.  Unless maybe it would increase the network traffic due to the lack of write consolidation: many small writes instead of one larger one.

I don't need details, just a couple of sketchy examples so I can visualize what you are referring to.
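
For what it's worth, here is the kind of pattern I have in mind (a made-up sketch, not from any real application; the path and sizes are hypothetical).  With buffered IO the client's page cache can merge these small writes into a few large NFS WRITEs at flush time, whereas with direct IO each write() would have to go to the server on its own:

    /* Hypothetical sketch: many small sequential writes to an NFS file. */
    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        char buf[4096];
        memset(buf, 'x', sizeof(buf));

        int fd = open("/mnt/nfs/backup.dat", O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0)
            return 1;

        for (int i = 0; i < 1024; i++) {            /* 1024 x 4 KiB = 4 MiB */
            if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
                close(fd);
                return 1;
            }
        }

        close(fd);    /* with buffered IO, dirty pages are flushed around here */
        return 0;
    }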

Thanks for increasing my understanding,

-=# Paul Gilliam #=-


>To make direct I/O work well, applications have to use it sparingly and
>appropriately.  They usually maintain their own buffer cache in lieu of the
>client's generic page cache.  Applications like shells and editors depend
>on an NFS client's local page cache to work well.
>
>So, we have chosen to support direct I/O only when each file is opened, not
>as a file system wide option.  This is a much narrower application of this
>feature, and has a better chance of helping performance in special cases
>while not destroying it broadly.
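
(For reference, a minimal sketch of the per-open interface described above; the path is hypothetical, and O_DIRECT generally requires suitably aligned buffers, offsets, and lengths, with the exact constraints varying by kernel version and filesystem.)

    /* Hypothetical sketch: request direct IO for one file only. */
    #define _GNU_SOURCE                         /* for O_DIRECT */
    #include <fcntl.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        void *buf;
        size_t len = 1 << 20;                   /* 1 MiB, a page multiple */

        if (posix_memalign(&buf, 4096, len))    /* page-aligned buffer */
            return 1;
        memset(buf, 0, len);

        /* Direct IO applies to this open file description only; other
         * files and other users keep the normal client page cache. */
        int fd = open("/mnt/nfs/dump.dat", O_WRONLY | O_CREAT | O_DIRECT, 0644);
        if (fd < 0)
            return 1;

        ssize_t n = write(fd, buf, len);        /* bypasses the client page cache */
        close(fd);
        free(buf);
        return n == (ssize_t)len ? 0 : 1;
    }
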
>
>So far I haven't read anything here that clearly states a requirement we
>have overlooked in the past.
>
>For your "cp" example, the NFS community is looking at ways to reduce the
>overhead of file copy operations by offloading them to the server.  The
>file data doesn't have to travel over the network to the client.  Someone
>recently said when you leave this kind of choice up to users, they will
>usually choose exactly the wrong option.  This is a clear case where the
>system and application developers will choose better than users who have no
>programming skills.
>
>
>> On Sat, Aug 7, 2010 at 1:09 AM, Trond Myklebust
>> <trond.myklebust@xxxxxxxxxx> wrote:
>>> On Sat, 2010-08-07 at 00:59 +0530, Peter Chacko wrote:
>>>> Imagine a third-party backup app (one that doesn't open files with
>>>> O_DIRECT) for which a customer has no source code, backing up
>>>> millions of files through NFS. How can we do non-cached IO to the
>>>> target server? We cannot use O_DIRECT here, as we don't have the
>>>> source code. A mount option works just right: if we can have
>>>> read-only mounts, why not have a DIO-only mount?
>>>>
>>>> A truly application-aware storage system (in this case the NFS
>>>> client), which is what next-generation storage systems should be,
>>>> should absorb application needs that may apply to the whole FS....
>>>>
>>>> I don't say the O_DIRECT flag is a bad idea, but it only works for a
>>>> regular application that does IO to some files. It is not the best
>>>> solution when the NFS server is used as storage for secondary data,
>>>> where the NFS client runs third-party applications that otherwise run
>>>> best on local storage, since there are no caching issues there....
>>>>
>>>> What do you think ?
>>>
>>> I think that we've had O_DIRECT support in the kernel for more than six
>>> years now. If there are backup vendors out there that haven't been
>>> paying attention, then I'd suggest looking at other vendors.
>>>
>>> Trond
>>>
>>>> On Fri, Aug 6, 2010 at 11:07 PM, Trond Myklebust
>>>> <trond.myklebust@xxxxxxxxxx> wrote:
>>>>> On Fri, 2010-08-06 at 15:05 +0100, Peter Chacko wrote:
>>>>>> Some distributed file systems such as IBM's SANFS, support direct IO
>>>>>> to the target storage....without going through a cache... ( This
>>>>>> feature is useful, for write only work load....say, we are backing up
>>>>>> huge data to an NFS share....).
>>>>>>
>>>>>> I think that if it is not available, we should add a DIO mount
>>>>>> option that tells the VFS not to cache any data, so that the close
>>>>>> operation will not stall.
>>>>>
>>>>> Ugh no! Applications that need direct IO should be using
>>>>> open(O_DIRECT), not relying on hacks like mount options.
>>>>>
>>>>>> With the open-to-close cache coherence protocol of NFS, an
>>>>>> aggressively caching client is a performance downer for many
>>>>>> workloads that are write-mostly.
>>>>>
>>>>> We already have full support for vectored aio/dio in the NFS client
>>>>> for those applications that want to use it.
>>>>>
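
(For reference, a rough sketch of what vectored direct IO looks like from an application; the path is hypothetical, and each segment is kept page-sized and page-aligned to satisfy the usual O_DIRECT constraints.)

    /* Hypothetical sketch: one vectored write, four segments, no caching. */
    #define _GNU_SOURCE                         /* for O_DIRECT */
    #include <fcntl.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/uio.h>
    #include <unistd.h>

    int main(void)
    {
        enum { SEGS = 4, SEG_LEN = 4096 };
        struct iovec iov[SEGS];

        for (int i = 0; i < SEGS; i++) {
            if (posix_memalign(&iov[i].iov_base, 4096, SEG_LEN))
                return 1;
            memset(iov[i].iov_base, 'a' + i, SEG_LEN);
            iov[i].iov_len = SEG_LEN;
        }

        int fd = open("/mnt/nfs/vec.dat", O_WRONLY | O_CREAT | O_DIRECT, 0644);
        if (fd < 0)
            return 1;

        /* One call submits all four segments at offset 0. */
        ssize_t n = pwritev(fd, iov, SEGS, 0);

        close(fd);
        for (int i = 0; i < SEGS; i++)
            free(iov[i].iov_base);
        return n == SEGS * SEG_LEN ? 0 : 1;
    }
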
>>>>> Trond
>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Aug 6, 2010 at 2:26 PM, Jim Rees <rees@xxxxxxxxx> wrote:
>>>>>>> Matthew Hodgson wrote:
>>>>>>>
>>>>>>>  Is there any way to tune the Linux NFSv3 client to prefer writing
>>>>>>>  data straight to an async-mounted server, rather than having large
>>>>>>>  writes to a file stack up in the local pagecache before being
>>>>>>>  synced on close()?
>>>>>>>
>>>>>>> It's been a while since I've done this, but I think you can tune
>>>>>>> this with the vm.dirty_writeback_centisecs and
>>>>>>> vm.dirty_background_ratio sysctls.  The data will still go through
>>>>>>> the page cache, but you can reduce the amount that stacks up.
>>>>>>>
>>>>>>> There are other places where the data can get buffered, like the
>>>>>>> RPC layer, but it won't sit there any longer than it takes for it
>>>>>>> to go out the wire.
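
(Illustrative only: one way those knobs might be set, with made-up values; sensible numbers depend on the amount of RAM and on the workload.  The settings below could go in /etc/sysctl.conf, or be applied at run time with "sysctl -w".)

    # Hypothetical example values -- tune for your own system.
    # Start background writeback once dirty pages reach 5% of memory,
    # and wake the flusher threads every 1 second (units are centiseconds).
    vm.dirty_background_ratio = 5
    vm.dirty_writeback_centisecs = 100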
>>>>>
>>>>>
>>>>>
>>>>>
>>>
>>>
>>>
>>>
>
>--
>Chuck Lever
>chuck[dot]lever[at]oracle[dot]com
>
>
>

