Re: Re: [PATCH 0/3] Enable the setting of a kernel module parameter from nfs.conf

"NeilBrown" <neilb@xxxxxxx> · Fri, 21 May 2021 12:39:41 +1000

On Tue, 18 May 2021, Steve Dickson wrote:
> Sorry for the delay... I took some PTO... 
> 
> On 5/12/21 8:29 PM, NeilBrown wrote:
> > On Fri, 16 Apr 2021, Steve Dickson wrote:
> >> Hey Chuck! 
> >>
> >> On 4/14/21 7:26 PM, Chuck Lever III wrote:
> >>> Hi Steve-
> >>>
> >>>> On Apr 14, 2021, at 2:10 PM, Steve Dickson <SteveD@xxxxxxxxxx> wrote:
> >>>>
> >>>> This is a tweak of the patch set Alice Mitchell posted last July [1].
> >>>
> >>> That approach was dropped last July because it is not container-aware.
> >>> It should be simple for someone to write a udev script that uses the
> >>> existing sysfs API that can update nfs4_client_id in a namespace. I
> >>> would prefer the sysfs/udev approach for setting nfs4_client_id,
> >>> since it is container-aware and makes this setting completely
> >>> automatic (zero touch).
> >> As I said in my cover letter, I see this more as introduction of
> >> a mechanism more than a way to set the unique id. The mechanism being
> >> a way to set kernel module params from nfs.conf. The setting of
> >> the id is just a side effect... 
> > 
> > I wonder if this is the best approach for setting module parameters.
> > 
> > rpc.nfsd already sets grace-time and lease-time - which aren't
> > exactly module parameters, but are similar - using values from nfs.conf.
> > Similarly statd sets /proc/fs/nfs/nlm_tcport based on nfs.conf.
> > 
> > I don't think these things should appear in nfs.conf as "kernel
> > parameters", but as service parameters for the particular service.
> > How they are communicated to the kernel is an internal implementation
> > detail.  Maybe it will involve setting module parameters (at least on
> > older kernels).
> I think I understand your idea of looking at things as "service parameters"
> instead of "kernel parameters", but looking at the actual parameters
> that might be a bit difficult. 
> 
> Some do map to a service, like nfs4_disable_idmapping, which could be set
> from /etc/idmapd.conf, but things like send_implementation_id or 
> delegation_watermark do not really map to a particular service
> or am I missing something?

There are two "nfs4_disable_idmapping" parameters.  One for server, one
for client.
The server one should, I think, be set by rpc.nfsd based on a setting in
the [nfsd] section of nfs.conf.

The client one should (I think) be set by mount.nfs using whatever
config language we decide is appropriate.

> 
> > 
> > For the "identity" setting, I think it would be best if this were
> > checked and updated by mount.nfs (similar to the way mount.nfs will
> > check if statd is running, and will start it if necessary).  So should
> > it go in nfsmount.conf instead of nfs.conf?? I'm not sure.
> Interesting idea...I would think nfsmount.conf would be the
> right place.

Maybe...  nfsmount.conf is currently only for mount options.  These can
all be per-server or per-mountpoint, or global.
It might make sense to have other things in the global section ...
though it is named "NFSMount_Global_Options" which seems to explicitly
suggest that these are mount options.

I think I lean towards an [nfs] or possibly [mount] section of nfs.conf.

> 
> > 
> > It isn't clear to me where the identity should come from.
> > In some circumstances it might make sense to take it from nfs.conf.
> > In that case we would want to support reading /etc/netns/NAME/nfs.conf
> > where NAME was determined in much the same way that "ip netns identify"
> > determines a name.  (Compare inum of /proc/self/ns/net with the inum of
> > each name in /run/netns/).
> I think supporting configs per namespaces is a good idea. I don't
> think it would be too difficult to do since we already support
> the nfs.d directory. 

Yes, reading multiple files should be easy enough once we know what we
want to do.

> 
> 
> > If we did that, we could then support "$netns" in the conf file, and
> > allow
> > 
> >  [nfs]
> >   identity = ${hostname}-${netns}
> > 
> > in /etc/nfs.conf, and it would Do The Right Thing for many cases.
> I'm a bit namespace challenged... but as I see it using 
> "ip netns identify" (w/out the [PID]) would return all of
> the current network namespaces. Then we would run through
> the /etc/nfs.conf.d/ directory looking for a matching directory
> for any of the returned namespaces. If found that config
> would be used. Something along those lines? 
> 
> With multiple namespaces, how would we know which one to use? 

(I'm only just coming up to speed on network namespaces too....)

A given process can only be in one network namespace.  If it is in the
initial namespace (same as the 'init' process) then "ip netns identify"
reports nothing.  If in some other namespace, then that namespace is
reported.
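
A minimal sketch of that lookup, in Python for brevity (nfs-utils itself
is C).  It assumes named namespaces appear under /run/netns/, as
"ip netns add" arranges, and compares device and inode numbers of
/proc/self/ns/net against each entry.  The function name and its
parameters are made up for illustration.

```python
import os


def identify_netns(self_ns="/proc/self/ns/net", netns_dir="/run/netns"):
    """Return the name under netns_dir that refers to our network namespace,
    or None if nothing matches (typically: we are in the initial namespace)."""
    try:
        mine = os.stat(self_ns)
    except OSError:
        return None  # no namespace support, or /proc is not mounted
    try:
        names = sorted(os.listdir(netns_dir))
    except OSError:
        return None  # no named namespaces have been created
    for name in names:
        try:
            st = os.stat(os.path.join(netns_dir, name))
        except OSError:
            continue  # entry vanished or is unreadable; skip it
        # Same device and inode means the same namespace object.
        if (st.st_dev, st.st_ino) == (mine.st_dev, mine.st_ino):
            return name
    return None
```

A caller such as mount.nfs could treat a None result as "use /etc/nfs.conf
only".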

So if 'mount.nfs' is run in some other net-namespace, it should let
settings in /etc/netns/NAME/nfs.conf over-ride settings in /etc/nfs.conf.
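
Roughly, the override plus the "${hostname}-${netns}" expansion quoted
above could look like the sketch below.  Assumptions: Python's
configparser stands in for the nfs-utils conffile parser (the real
syntax differs in places), the per-namespace root defaults to the
/etc/netns/NAME layout that ip-netns uses, and the [nfs] section and
"identity" key are only the proposal from this thread, not existing
options.  load_identity() is a made-up name.

```python
import configparser
import os
import socket
from string import Template


def load_identity(netns, base="/etc/nfs.conf", netns_root="/etc/netns",
                  hostname=None):
    """Read base, then the per-namespace file if netns is set, and return
    the expanded [nfs] identity value (or None if it is not configured)."""
    files = [base]
    if netns:
        files.append(os.path.join(netns_root, netns, "nfs.conf"))
    # Files are read in order: later values replace earlier ones, and
    # files that do not exist are silently skipped.
    conf = configparser.ConfigParser(interpolation=None)
    conf.read(files)
    raw = conf.get("nfs", "identity", fallback=None)
    if raw is None:
        return None
    # Expand ${hostname} and ${netns}; unknown ${...} names are left alone.
    return Template(raw).safe_substitute(
        hostname=hostname or socket.gethostname(),
        netns=netns or "",
    )
```

In the initial namespace netns is empty, so the template above would
yield "HOSTNAME-"; whether that trailing dash is acceptable is one of the
details that would need deciding.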

I'm becoming less enamoured with the idea of using network namespaces to
ensure separate transports are used.  Creating a new namespace means
that either you need a new IP address for that namespace, or you need to
set up NAT so processes in the namespace can access the network.  Both
of these seem like a bit too much overhead just to get an independent
TCP connection (or set of connections) to the server.
I almost want an "NFS namespace" which shares the network but has
separate transports.  I have something like that in our SLE12 kernels
(-o sharetransport=NN) but I'd like a better solution.

Being able to insist on a separate transport is really useful for
problem analysis, and has other administrative uses.

NeilBrown