Re: Question on tuning sunrpc.tcp_slot_table_entries

Hi Jeff-

On Jul 1, 2013, at 4:54 PM, Jeff Wright <Jeff.Wright@xxxxxxxxxx> wrote:

> Team,
> 
> I am supporting Oracle MOS note 1354980.1, which covers tuning clients for RMAN backup to the ZFS Storage Appliance.  One of the tuning recommendations is to change sunrpc.tcp_slot_table_entries from the default (16) to 128 to increase the number of concurrent I/Os we can get per client mount point.  This is presumed good for general-purpose kernel NFS application traffic to the ZFS Storage Appliance.  I recently received the following comment regarding the efficacy of the sunrpc.tcp_slot_table_entries tune:
> 
> "In most cases, the parameter "sunrpc.tcp_slot_table_entries" can not be set even if applying int onto /etc/sysctl.conf although this document says users should do so.
> Because, the parameter is appeared after sunrpc.ko module is loaded(=NFS service is started), and sysctl was executed before starting NFS service."

I believe that assessment is correct.  It is also true that setting sunrpc.tcp_slot_table_entries has no effect on existing NFS mounts.  The value of this setting is copied each time a new RPC transport is created, and not referenced again.
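
For what it's worth, a quick way to see whether the sysctl exists yet, and what value new transports would pick up, is simply to read it (these are the standard sunrpc sysctl paths, so they should apply to OL6):

  # Fails until sunrpc.ko is loaded, which is the problem described above
  sysctl sunrpc.tcp_slot_table_entries

  # Same value via procfs
  cat /proc/sys/sunrpc/tcp_slot_table_entries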

A better approach might be to specify this setting via a module parameter, so it is set immediately whenever the sunrpc.ko module is loaded.  I haven't tested this myself.

The exact mechanism for hard-wiring a module parameter varies among distributions, but OL6 has the /etc/modprobe.d/ directory, where a .conf file can be added.  Something like this:

  echo "options sunrpc tcp_slot_table_entries=128" | sudo tee /etc/modprobe.d/sunrpc.conf

Then reboot, of course.
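
Once the box is back up, I would expect the value the module actually picked up to be visible through sysfs as well as the sysctl, assuming the parameter is exported there (I believe it is, but again, untested):

  # Both should report 128 if the modprobe.d option took effect
  cat /sys/module/sunrpc/parameters/tcp_slot_table_entries
  sysctl sunrpc.tcp_slot_table_entries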

In more recent versions of the kernel, the maximum number of RPC slots is determined dynamically.  The relevant change looks like commit d9ba131d ("SUNRPC: Support dynamic slot allocation for TCP connections"), dated Sun Jul 17 18:11:30 2011.

That commit appeared upstream in kernel 3.1.  Definitely not in Oracle's UEKr1 or UEKr2 kernels.  No idea about recent RHEL/OL 6 updates, but I suspect not.

However, you can expect to see this feature in distributions that carry more recent kernels, like RHEL 7, and it probably appears in the UEKr3 kernel (the UEKr3 alphas are based on much more recent upstream kernels).
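
One rough way to check a running kernel for the dynamic slot allocation code, I believe, is to look for the companion "max" tunable that was added alongside it:

  # Only present on kernels that have dynamic RPC slot allocation
  cat /proc/sys/sunrpc/tcp_max_slot_table_entries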

> I'd like to find out how to tell if the tune is actually in play for the running kernel, and if there is a difference between what is reported in /proc and what is running in core.

The nfsiostat command reports the size of the RPC backlog queue, which is a measure of whether the RPC slot table size is starving requests.  There are certain operations (WRITE, for example) that will have a long queue no matter what.
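
For example, something along these lines while the backup is running (the mount point is just a placeholder, and the column labels vary a bit between nfs-utils versions):

  # Report every 5 seconds for the mount used by RMAN; watch the "rpc bklog" column
  nfsiostat 5 /mnt/zfssa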

I can't think of a way of directly observing the slot table size in use for a particular mount.  That's been a perennial issue with this feature.

> Could anyone on the alias suggest how to validate if the aforementioned comment is relevant for the Linux kernel I am running with?  I am familiar with using mdb on Solaris to check what values the Solaris kernel is running with, so if there is a Linux equivalent, or another way to do this sort of thing with Linux, please let me know.

-- 
Chuck Lever
chuck[dot]lever[at]oracle[dot]com







