On Mon, 2019-06-03 at 17:07 +0200, Mkrtchyan, Tigran wrote:
> Dear NFS fellows,
>
> though this is not directly an NFS issue, I post this question here as
> we are mostly affected by NFS clients (and you have enough kernel
> connections to route it to the right people).
>
> We have 25 new data processing nodes with 32 cores, 256 GB RAM and a
> 25 Gb/s NIC. They run CentOS 7 (but this is irrelevant, I think).
>
> When each node runs 24 parallel write-intensive (75% write, 25% read)
> workloads, we see a spike of IO errors on close. The client runs into
> timeouts due to a slow network or IO starvation on the NFS servers.
> It stumbles, disconnects, establishes a new connection and stumbles
> again...

You can adjust the pNFS timeout behaviour using the 'dataserver_timeo'
and 'dataserver_retrans' module parameters on both the files and
flexfiles pNFS driver modules.

> As the default values for dirty pages are
>
> vm.dirty_background_bytes = 0
> vm.dirty_background_ratio = 10
> vm.dirty_bytes = 0
> vm.dirty_ratio = 30
>
> the first data only get sent once at least 25 GB of dirty data has
> accumulated.
>
> To make the full deployment more responsive, we have reduced the
> defaults to something more reasonable:
>
> vm.dirty_background_ratio = 0
> vm.dirty_ratio = 0
> vm.dirty_background_bytes = 67108864
> vm.dirty_bytes = 536870912
>
> IOW, we force the client to start sending data as soon as 64 MB has
> been written. The question is how to find optimal values for these
> knobs, and how to make them file system/mount point specific.

The memory management system knows nothing about mount points, and the
filesystems know nothing about the memory management limits. That is by
design.

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx
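
[Editor's note: the 'dataserver_timeo' / 'dataserver_retrans' parameters
mentioned above can be set persistently through a modprobe configuration
file. A minimal sketch follows; the module names and the filename are
assumptions (verify with `modinfo` on your kernel), and the values shown
are the usual defaults, not tuning recommendations:]

    # /etc/modprobe.d/pnfs-timeouts.conf  (hypothetical filename)
    # dataserver_timeo is in tenths of a second; dataserver_retrans is the
    # number of retransmissions before the DS connection is declared dead.
    options nfs_layout_nfsv41_files dataserver_timeo=600 dataserver_retrans=5
    options nfs_layout_flexfiles dataserver_timeo=600 dataserver_retrans=5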
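
[Editor's note: the reduced byte-based limits from the quoted message can
be applied persistently via sysctl.d; the filename below is hypothetical.
Note that writing a *_bytes knob automatically zeroes the corresponding
*_ratio knob, so the two ratio lines from the message are redundant:]

    # /etc/sysctl.d/90-nfs-writeback.conf  (hypothetical filename)
    # Values taken from the message above: start background writeback at
    # 64 MB of dirty data, block writers at 512 MB.
    vm.dirty_background_bytes = 67108864
    vm.dirty_bytes = 536870912

[Apply with `sysctl --system` or at the next boot.]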
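
[Editor's note: the "25 GB" figure in the quoted message follows directly
from the defaults: with vm.dirty_background_ratio = 10 on a 256 GB node,
background writeback only starts at roughly a tenth of RAM. A quick
back-of-the-envelope sketch (the kernel actually applies the ratio to
*dirtyable* memory, not total RAM, so the real threshold is a bit lower):]

```shell
# Estimate the background writeback threshold on a 256 GiB node
# with the default vm.dirty_background_ratio = 10.
ram_bytes=$((256 * 1024 * 1024 * 1024))
dirty_background_ratio=10
threshold_bytes=$((ram_bytes * dirty_background_ratio / 100))
echo "background writeback starts around $((threshold_bytes / 1024 / 1024 / 1024)) GiB dirty"
# prints: background writeback starts around 25 GiB dirty
```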