Re: [PATCH 00/16] Cache open file descriptors in knfsd

> On Jul 1, 2019, at 11:17 AM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
> 
> On Mon, 2019-07-01 at 11:02 -0400, Chuck Lever wrote:
>> Interesting work! Kudos to you and Jeff.
>> 
>> 
>>> On Jun 30, 2019, at 9:52 AM, Trond Myklebust <trondmy@xxxxxxxxx>
>>> wrote:
>>> 
>>> When an NFSv3 READ or WRITE request comes in, the first thing knfsd
>>> has to do is open a new file descriptor. While this is often a
>>> relatively inexpensive operation for most local filesystems, it is
>>> usually less so for FUSE, clustered, or networked filesystems that
>>> are being exported by knfsd.
>> 
>> True, I haven't measured much effect, if any, from open and close on
>> local file systems. It would be valuable if the cover letter provided
>> a more quantified assessment of the cost for these other use cases.
>> It sounds plausible to me that they would be more expensive, but I'm
>> wondering whether the additional complexity of an open file cache is
>> warranted and effective. Do you have any benchmark results to share?
>> 
>> Are there particular workloads where you believe open caching will be
>> especially beneficial?
> 
> I'd expect pretty much anything with a nontrivial open() method to
> benefit, e.g. FUSE, GFS2, OCFS2, Ceph, and so on.
> 
> I've seen no slowdowns so far with traditional filesystems such as
> ext4 and xfs.
> 
> Note that the removal of the raparms cache in many ways compensates
> for the new need to look up the struct file.
> 
>>> This set of patches attempts to reduce some of that cost by caching
>>> open file descriptors so that they may be reused by other incoming
>>> READ/WRITE requests for the same file.
>> 
>> Is the open file cache a single cache per server? I'm wondering
>> whether there can be significant interference (e.g., lock contention
>> or cache sloshing) between separate workloads on different exports.
> 
> The file cache is global. Cache lookups are lockless (i.e.
> RCU-protected), so there is little contention in the case where an
> entry already exists. When we have to add an entry, there is a mutex
> that might get contended under workloads with lots of small-file
> opens and closes.
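
For readers following along, the pattern Trond describes is the classic
RCU lockless lookup with a locked slow path for insertion. A minimal
sketch, using hypothetical helper and lock names rather than the actual
fs/nfsd/filecache.c code:

	struct nfsd_file *nf;

	/* Fast path: lockless lookup under RCU. */
	rcu_read_lock();
	nf = nfsd_file_find_rcu(inode);		/* hypothetical hash lookup */
	if (nf && !refcount_inc_not_zero(&nf->nf_ref))
		nf = NULL;			/* raced with the final put */
	rcu_read_unlock();

	/* Slow path: only threads inserting new entries contend here. */
	if (!nf) {
		mutex_lock(&nfsd_file_cache_mutex);	/* hypothetical */
		nf = nfsd_file_find_or_insert(inode);
		mutex_unlock(&nfsd_file_cache_mutex);
	}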

>> Do you have any benchmark results that show that removing the raparms
>> cache is harmless?
> 
> The same information is carried in struct file. The whole raparms
> cache was just a hack that allowed us to carry the readahead
> information across struct file instances. Now that we are caching the
> struct file itself, the raparms hack is unnecessary.

OK. I see the patch description of 11/16 mentions something about "stop
fiddling with raparms", but IMO the patch description for 13/16 should
be updated to make the above clear. Thanks!
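
For context, the readahead state in question lives directly in struct
file, so a cached struct file keeps it alive automatically. Abridged
from include/linux/fs.h:

	struct file {
		...
		struct file_ra_state	f_ra;	/* per-file readahead state */
		...
	};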


> IOW: I haven't seen any slowdowns so far; however, I don't have
> access to a bleeding-edge networking setup that would push this
> further.
> 
>>> One danger when doing this is that knfsd may end up caching file
>>> descriptors for files that have been unlinked. To deal with this
>>> issue, we use fsnotify to monitor the files, and have hooks to
>>> evict those descriptors from the file cache if the i_nlink value
>>> goes to 0.
>>> 
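
In other words, the eviction amounts to roughly the following check in
the fsnotify event handler. This is an illustrative sketch only, with a
hypothetical eviction helper; the real logic lives in
fs/nfsd/filecache.c:

	/* On an fsnotify event for a watched inode: if the last link
	 * is gone, purge any cached descriptors for that inode. */
	if (inode->i_nlink == 0)
		nfsd_file_close_inode(inode);	/* hypothetical helper */
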
>>> Jeff Layton (12):
>>> sunrpc: add a new cache_detail operation for when a cache is
>>> flushed
>>> locks: create a new notifier chain for lease attempts
>>> nfsd: add a new struct file caching facility to nfsd
>>> nfsd: hook up nfsd_write to the new nfsd_file cache
>>> nfsd: hook up nfsd_read to the nfsd_file cache
>>> nfsd: hook nfsd_commit up to the nfsd_file cache
>>> nfsd: convert nfs4_file->fi_fds array to use nfsd_files
>>> nfsd: convert fi_deleg_file and ls_file fields to nfsd_file
>>> nfsd: hook up nfs4_preprocess_stateid_op to the nfsd_file cache
>>> nfsd: have nfsd_test_lock use the nfsd_file cache
>>> nfsd: rip out the raparms cache
>>> nfsd: close cached files prior to a REMOVE or RENAME that would
>>>   replace target
>>> 
>>> Trond Myklebust (4):
>>> notify: export symbols for use by the knfsd file cache
>>> vfs: Export flush_delayed_fput for use by knfsd.
>>> nfsd: Fix up some unused variable warnings
>>> nfsd: Fix the documentation for svcxdr_tmpalloc()
>>> 
>>> fs/file_table.c                  |   1 +
>>> fs/locks.c                       |  62 +++
>>> fs/nfsd/Kconfig                  |   1 +
>>> fs/nfsd/Makefile                 |   3 +-
>>> fs/nfsd/blocklayout.c            |   3 +-
>>> fs/nfsd/export.c                 |  13 +
>>> fs/nfsd/filecache.c              | 885 +++++++++++++++++++++++++++++++
>>> fs/nfsd/filecache.h              |  60 +++
>>> fs/nfsd/nfs4layouts.c            |  12 +-
>>> fs/nfsd/nfs4proc.c               |  83 +--
>>> fs/nfsd/nfs4state.c              | 183 ++++---
>>> fs/nfsd/nfs4xdr.c                |  31 +-
>>> fs/nfsd/nfssvc.c                 |  16 +-
>>> fs/nfsd/state.h                  |  10 +-
>>> fs/nfsd/trace.h                  | 140 +++++
>>> fs/nfsd/vfs.c                    | 295 ++++-------
>>> fs/nfsd/vfs.h                    |   9 +-
>>> fs/nfsd/xdr4.h                   |  19 +-
>>> fs/notify/fsnotify.h             |   2 -
>>> fs/notify/group.c                |   2 +
>>> fs/notify/mark.c                 |   6 +
>>> include/linux/fs.h               |   5 +
>>> include/linux/fsnotify_backend.h |   2 +
>>> include/linux/sunrpc/cache.h     |   1 +
>>> net/sunrpc/cache.c               |   3 +
>>> 25 files changed, 1465 insertions(+), 382 deletions(-)
>>> create mode 100644 fs/nfsd/filecache.c
>>> create mode 100644 fs/nfsd/filecache.h
>>> 
>>> -- 
>>> 2.21.0
>>> 
>> 
>> --
>> Chuck Lever
>> 
>> 
>> 
> -- 
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@xxxxxxxxxxxxxxx

--
Chuck Lever
