Re: [RFC PATCH 0/5] Fun with the multipathing code

On Fri, 2017-04-28 at 10:45 -0700, Chuck Lever wrote:
> > On Apr 28, 2017, at 10:25 AM, Trond Myklebust
> > <trond.myklebust@primarydata.com> wrote:
> > 
> > In the spirit of experimentation, I've put together a set of patches
> > that implement setting up multiple TCP connections to the server.
> > The connections all go to the same server IP address, so they do not
> > provide support for multiple IP addresses (which I believe is
> > something Andy Adamson is working on).
> > 
> > The feature is only enabled for NFSv4.1 and NFSv4.2 for now; I don't
> > feel comfortable subjecting NFSv3/v4 replay caches to this
> > treatment yet. It relies on the mount option "nconnect" to specify
> > the number of connections to set up. So you can do something like
> >  'mount -t nfs -o vers=4.1,nconnect=8 foo:/bar /mnt'
> > to set up 8 TCP connections to server 'foo'.
> 
> IMO this setting should eventually be set dynamically by the
> client, or should be global (eg., a module parameter).

There is an argument for making it a per-server value (which is what
this patchset does). It gives the admin some control to limit the
number of connections to specific servers that need to serve larger
numbers of clients. However, I'm open to counterarguments. I've no
strong opinions yet.

> Since mount points to the same server share the same transport,
> what happens if you specify a different "nconnect" setting on
> two mount points to the same server?

Currently, the first one wins.
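
A minimal sketch of that "first one wins" behaviour, as a toy model and
not the actual kernel code. The names Transport, TransportCache and
get_transport are hypothetical and exist only for illustration:

```python
from dataclasses import dataclass


@dataclass
class Transport:
    """Toy stand-in for the transport shared by all mounts of one server."""
    server: str
    nconnect: int


class TransportCache:
    """Hypothetical model: one shared transport per server name."""

    def __init__(self):
        self._by_server = {}

    def get_transport(self, server, nconnect=1):
        # The first mount of a server creates the transport and fixes
        # its connection count; later mounts reuse it unchanged.
        if server not in self._by_server:
            self._by_server[server] = Transport(server, nconnect)
        return self._by_server[server]


cache = TransportCache()
first = cache.get_transport("foo", nconnect=8)
second = cache.get_transport("foo", nconnect=2)  # the request for 2 is ignored
print(second.nconnect, first is second)
```

In this model the second mount silently inherits the first mount's
connection count.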

> What will the client do if there are not enough resources
> (eg source ports) to create that many? Or is this an "up to N"
> kind of setting? I can imagine a big client having to reduce
> the number of connections to each server to help it scale in
> number of server connections.

There is an arbitrary (compile time) limit of 16. The use of the
SO_REUSEPORT socket option ensures that we should almost always be able
to satisfy that number of source ports, since they can be shared with
connections to other servers.
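
To illustrate the port sharing from userspace, here is a rough sketch
(the kernel client sets the option internally; this is not that code).
The helper name bind_shared_source_port is made up, and the sketch
assumes a platform that exposes SO_REUSEPORT, e.g. Linux 3.9 or later:

```python
import socket


def bind_shared_source_port(count=2):
    """Bind `count` TCP sockets to one local port using SO_REUSEPORT.

    Returns the list of bound sockets, or None where the option is
    not available on this platform.
    """
    if not hasattr(socket, "SO_REUSEPORT"):
        return None
    socks = []
    port = 0  # 0 lets the kernel choose the port for the first socket
    for _ in range(count):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # The option must be set on every socket before bind().
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
        s.bind(("", port))
        port = s.getsockname()[1]  # later sockets reuse the same port
        socks.append(s)
    return socks


socks = bind_shared_source_port()
if socks is not None:
    ports = {s.getsockname()[1] for s in socks}
    print("distinct local ports in use:", len(ports))
    for s in socks:
        s.close()
```

Each TCP connection still needs a unique 4-tuple, so sockets sharing a
source port have to connect to different destinations.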

> Other storage protocols have a mechanism for determining how
> transport connections are provisioned: One connection per
> CPU core (or one CPU per NUMA node) on the client. This gives
> a clear way to decide which connection to use for each RPC,
> and guarantees the reply will arrive at the same compute
> domain that sent the call.

Can we perhaps lay out a case for which mechanisms are useful as far as
hardware is concerned? I understand the socket code is already
affinitised to CPU caches, so that one's easy. I'm less familiar with
the various features of the underlying offloaded NICs and how they tend
to react when you add/subtract TCP connections.
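
For concreteness, the per-CPU provisioning described in the quoted text
could be modelled roughly as below. This is a toy sketch under my own
assumptions: pick_connection is a hypothetical helper, and the patchset
does not necessarily select connections this way.

```python
def pick_connection(cpu_id, nconnect):
    """Map the CPU issuing an RPC to a connection index.

    With nconnect equal to the CPU count this is one connection per
    CPU; with fewer connections, several CPUs share one.
    """
    if nconnect < 1:
        raise ValueError("nconnect must be at least 1")
    return cpu_id % nconnect


# 8 CPUs spread over 4 connections.
print([pick_connection(cpu, 4) for cpu in range(8)])
```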

> And of course: RPC-over-RDMA really loves this kind of feature
> (multiple connections between same IP tuples) to spread the
> workload over multiple QPs. There isn't anything special needed
> for RDMA, I hope, but I'll have a look at the SUNRPC pieces.

I haven't yet enabled it for RPC/RDMA, but I imagine you can help out
if you find it useful (as you appear to do).

> Thanks for posting, I'm looking forward to seeing this
> capability in the Linux client.
> 
> 
> > Anyhow, feel free to test and give me feedback as to whether or not
> > this helps performance on your system.
> > 
> > Trond Myklebust (5):
> >  SUNRPC: Allow creation of RPC clients with multiple connections
> >  NFS: Add a mount option to specify number of TCP connections to use
> >  NFSv4: Allow multiple connections to NFSv4.x (x>0) servers
> >  pNFS: Allow multiple connections to the DS
> >  NFS: Display the "nconnect" mount option if it is set.
> > 
> > fs/nfs/client.c             |  2 ++
> > fs/nfs/internal.h           |  2 ++
> > fs/nfs/nfs3client.c         |  3 +++
> > fs/nfs/nfs4client.c         | 13 +++++++++++--
> > fs/nfs/super.c              | 12 ++++++++++++
> > include/linux/nfs_fs_sb.h   |  1 +
> > include/linux/sunrpc/clnt.h |  1 +
> > net/sunrpc/clnt.c           | 17 ++++++++++++++++-
> > net/sunrpc/xprtmultipath.c  |  3 +--
> > 9 files changed, 49 insertions(+), 5 deletions(-)
> > 
> > -- 
> > 2.9.3
> > 
> 
> --
> Chuck Lever
> 
> 
> 
-- 
Trond Myklebust
Linux NFS client maintainer, PrimaryData
trond.myklebust@xxxxxxxxxxxxxxx



