[PATCH v1 00/16] nfsd: duplicate reply cache overhaul

For the last several years, our QA group has intermittently reported
occasional test failures, especially over UDP. When we look at the
traces, we see a missing reply from the server to a non-idempotent
request. The client then retransmits the request, and the server tries
to redo it instead of just resending the reply held in the DRC entry.

With an instrumented kernel on the server and a synthetic reproducer, we
found that it's quite easy to hammer the server so fast that the DRC
entries get flushed out long before a retransmit can come in.

This patchset is a first pass at fixing this. Instead of simply keeping
a cache of the last 1024 entries, it allows nfsd to grow and shrink the
DRC dynamically.
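
To illustrate the failure mode with a fixed-size cache, here is a
minimal userspace sketch. It is not the nfsd code: the toy_drc type and
its helpers are hypothetical names, and the eviction policy is a plain
oldest-first ring rather than the kernel's LRU list. It only shows that
once 1024 other requests arrive between a request and its retransmit,
the original entry is gone and the retransmit can no longer be matched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DRC_SLOTS 1024	/* fixed cache size mentioned above */

/* Hypothetical toy cache: remembers only the XIDs of recent requests. */
struct toy_drc {
	uint32_t xid[DRC_SLOTS];
	size_t next;	/* slot that the next insert overwrites */
	size_t used;	/* number of valid slots */
};

static void toy_drc_init(struct toy_drc *c)
{
	assert(c != NULL);
	c->next = 0;
	c->used = 0;
}

/* Record a request, evicting the oldest entry once the ring is full. */
static void toy_drc_insert(struct toy_drc *c, uint32_t xid)
{
	assert(c != NULL);
	c->xid[c->next] = xid;
	c->next = (c->next + 1) % DRC_SLOTS;
	if (c->used < DRC_SLOTS)
		c->used++;
}

/* Linear search; a real cache would hash, but that is irrelevant here. */
static bool toy_drc_lookup(const struct toy_drc *c, uint32_t xid)
{
	for (size_t i = 0; i < c->used; i++)
		if (c->xid[i] == xid)
			return true;
	return false;
}

/*
 * Insert request XID 1, then `intervening` unrelated requests, and
 * report whether a retransmit of XID 1 would still find its entry.
 */
static bool retransmit_hits(uint32_t intervening)
{
	struct toy_drc c;

	toy_drc_init(&c);
	toy_drc_insert(&c, 1);
	for (uint32_t i = 0; i < intervening; i++)
		toy_drc_insert(&c, 2 + i);
	return toy_drc_lookup(&c, 1);
}
```

In this sketch, retransmit_hits(1023) still finds the entry, while
retransmit_hits(1024) does not. On a busy server that window can close
well before the client's retransmit timer fires.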

The first patch is a bugfix for IPv6 support. The next several are
cleanups and reorganizations of the existing code. The tenth patch makes
the DRC entries dynamically allocated, and the ones following that add
various mechanisms to help keep the cache to a manageable size. The
final patch adds the ability to checksum the first part of the request,
intended as a way to mitigate the effects of an XID collision.
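
As a rough illustration of the checksum idea, here is a minimal
userspace sketch. It is not the code from the patch: toy_csum() is a
simple hypothetical polynomial hash standing in for whatever checksum
the kernel uses, and the toy_entry helpers are made-up names. The match
is also reduced to XID plus checksum, where a real cache would compare
more fields than that. The point is only that two different requests
which happen to share an XID no longer look like a retransmit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CSUM_LEN 256	/* checksum at most the first 256 bytes */

/* Hypothetical cache entry: XID plus a checksum of the request head. */
struct toy_entry {
	uint32_t xid;
	uint32_t csum;
};

/* Stand-in checksum (assumption): polynomial hash over the prefix. */
static uint32_t toy_csum(const unsigned char *buf, size_t len)
{
	uint32_t sum = 0;
	size_t n = len < CSUM_LEN ? len : CSUM_LEN;

	assert(buf != NULL);
	for (size_t i = 0; i < n; i++)
		sum = sum * 31 + buf[i];
	return sum;
}

/* Build the entry that would be stored when the request first arrives. */
static struct toy_entry toy_entry_make(uint32_t xid,
				       const unsigned char *buf, size_t len)
{
	struct toy_entry e = { xid, toy_csum(buf, len) };

	return e;
}

/* Treat a request as a retransmit only if XID and checksum both match. */
static bool toy_entry_matches(const struct toy_entry *e, uint32_t xid,
			      const unsigned char *buf, size_t len)
{
	return e->xid == xid && e->csum == toy_csum(buf, len);
}
```

A checksum can itself collide, so this narrows the window for a false
match rather than closing it, and bytes past the first 256 are not
covered at all.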

While most of us will probably say "so what" when it comes to UDP
failures, it's a potential problem on connected transports as well. I'm
also inclined to try and fix things that screw up the people that are
helping us test our code.

I'd like to see this merged for 3.9 if possible...

Jeff Layton (16):
  nfsd: fix IPv6 address handling in the DRC
  nfsd: remove unneeded spinlock in nfsd_cache_update
  nfsd: get rid of RC_INTR
  nfsd: create a dedicated slabcache for DRC entries
  nfsd: add alloc and free functions for DRC entries
  nfsd: remove redundant test from nfsd_reply_cache_free
  nfsd: clean up and clarify the cache expiration code
  nfsd: break out hashtable search into separate function
  nfsd: always move DRC entries to the end of LRU list when updating
    timestamp
  nfsd: dynamically allocate DRC entries
  nfsd: remove the cache_disabled flag
  nfsd: when updating an entry with RC_NOCACHE, just free it
  nfsd: add recurring workqueue job to clean the cache
  nfsd: track the number of DRC entries in the cache
  nfsd: register a shrinker for DRC cache entries
  nfsd: keep a checksum of the first 256 bytes of request

 fs/nfsd/cache.h             |  17 ++-
 fs/nfsd/nfscache.c          | 337 ++++++++++++++++++++++++++++++++++----------
 fs/nfsd/nfssvc.c            |   1 -
 include/linux/sunrpc/clnt.h |   4 +-
 4 files changed, 278 insertions(+), 81 deletions(-)

-- 
1.7.11.7
