On Wed, 2021-05-12 at 17:16 +0300, Dan Aloni wrote:
> On Wed, May 12, 2021 at 09:49:01AM -0400, Olga Kornievskaia wrote:
> > On Wed, May 12, 2021 at 9:40 AM Olga Kornievskaia
> > <olga.kornievskaia@xxxxxxxxx> wrote:
> > > On Wed, May 12, 2021 at 9:37 AM Olga Kornievskaia
> > > <olga.kornievskaia@xxxxxxxxx> wrote:
> > > > On Wed, May 12, 2021 at 6:42 AM Dan Aloni <dan@xxxxxxxxxxxx> wrote:
> > > > > On Tue, Apr 27, 2021 at 08:12:53AM -0400, Olga Kornievskaia wrote:
> > > > > > On Tue, Apr 27, 2021 at 12:42 AM Dan Aloni <dan@xxxxxxxxxxxx> wrote:
> > > > > > > On Mon, Apr 26, 2021 at 01:19:43PM -0400, Olga Kornievskaia wrote:
> > > > > > > > From: Olga Kornievskaia <kolga@xxxxxxxxxx>
> > > > > > > >
> > > > > > > > An rpc client uses a transport switch and one or more
> > > > > > > > transports associated with that switch. Since transports
> > > > > > > > are shared among rpc clients, create a symlink into the
> > > > > > > > xprt_switch directory instead of duplicating entries under
> > > > > > > > each rpc client.
> > > > > > > >
> > > > > > > > Signed-off-by: Olga Kornievskaia <kolga@xxxxxxxxxx>
> > > > > > > >
> > > > > > > > ..
> > > > > > > > @@ -188,6 +204,11 @@ void rpc_sysfs_client_destroy(struct rpc_clnt *clnt)
> > > > > > > >  	struct rpc_sysfs_client *rpc_client = clnt->cl_sysfs;
> > > > > > > >
> > > > > > > >  	if (rpc_client) {
> > > > > > > > +		char name[23];
> > > > > > > > +
> > > > > > > > +		snprintf(name, sizeof(name), "switch-%d",
> > > > > > > > +			 rpc_client->xprt_switch->xps_id);
> > > > > > > > +		sysfs_remove_link(&rpc_client->kobject, name);
> > > > > > >
> > > > > > > Hi Olga,
> > > > > > >
> > > > > > > If a client can use a single switch, shouldn't the name of
> > > > > > > the symlink be just "switch"?
> > > > > > > This is to be consistent with other symlinks in `sysfs`,
> > > > > > > such as the ones in the block layer, for example in my
> > > > > > > `/sys/block/sda`:
> > > > > > >
> > > > > > >   bdi -> ../../../../../../../../../../../virtual/bdi/8:0
> > > > > > >   device -> ../../../5:0:0:0
> > > > > >
> > > > > > I think the client is written so that in the future it might
> > > > > > have more than one switch?
> > > > >
> > > > > I wonder what would be the use for that, as a switch is already a
> > > > > collection of xprts. What would determine the switch to use per a
> > > > > new task IO?
> > > >
> > > > I thought the switch is a collection of xprts of the same type. And
> > > > if you wanted to have an RDMA connection and a TCP connection to
> > > > the same server, then they would be stored under different
> > > > switches? For instance, we round-robin through the transports, but
> > > > I don't see why we would be doing so between a TCP and an RDMA
> > > > transport. But I can see how a client could switch entirely from a
> > > > TCP-based transport to an RDMA one (or to a set of transports,
> > > > round-robining among that set). But perhaps I'm wrong in how I'm
> > > > thinking about xprt_switch and multipathing.
> > >
> > > <looks like my reply bounced so trying to resend>
> >
> > And more to answer your question: we don't have a method to switch
> > between different xprt switches. Nor do we have a way to specify how
> > to mount with multiple types of transports. Perhaps sysfs could be a
> > way to switch between the two. Perhaps during session-trunking
> > discovery, the server could return a list of IPs and the types of
> > transports.
> > Say all IPs were available via both TCP and RDMA; the client could
> > then create one switch with RDMA transports and another with TCP
> > transports, and perhaps a policy engine would decide which one to use
> > to begin with. The sysfs interface would then be a way to switch
> > between the two if there are problems.
>
> You raise a good point, also relevant to the ability to dynamically add
> new transports on the fly with the sysfs interface - their protocol
> type may be different.
>
> Returning to the topic of multiple switches per client, I recall it was
> hinted a few times that there is an intention to make the
> implementation details of xprtswitch loadable and pluggable with custom
> algorithms. If we are going in that direction, I'd expect that advanced
> transport management and request routing could live below the RPC
> client level, where we have example uses:
>
> 1) Optimizations in request routing that I've previously written about
> (based on request data memory).
>
> 2) If we lift the restriction against multiple protocol types on one
> xprtswitch, then we can also allow for an 'RDMA-by-default with TCP
> failover on standby' implementation similar to what you suggest, but
> with one switch maintaining two lists behind the scenes.

I'm not that interested in supporting multiple switches per client, or
in any setup that is asymmetric w.r.t. transport characteristics, at
this time. Any such setup is going to need a policy engine in order to
decide which RPC calls can be placed on which set of transports. That
again will end up adding a lot of complexity in the kernel itself. I've
yet to see any compelling justification for doing so.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@xxxxxxxxxxxxxxx