On Thu, May 13, 2021 at 11:18 AM Chuck Lever III <chuck.lever@xxxxxxxxxx> wrote:
>
>
>
> > On May 13, 2021, at 11:13 AM, Trond Myklebust <trondmy@xxxxxxxxxxxxxxx> wrote:
> >
> > On Wed, 2021-05-12 at 17:16 +0300, Dan Aloni wrote:
> >> On Wed, May 12, 2021 at 09:49:01AM -0400, Olga Kornievskaia wrote:
> >>> On Wed, May 12, 2021 at 9:40 AM Olga Kornievskaia
> >>> <olga.kornievskaia@xxxxxxxxx> wrote:
> >>>
> >>>> On Wed, May 12, 2021 at 9:37 AM Olga Kornievskaia
> >>>> <olga.kornievskaia@xxxxxxxxx> wrote:
> >>>>> On Wed, May 12, 2021 at 6:42 AM Dan Aloni <dan@xxxxxxxxxxxx> wrote:
> >>>>>> On Tue, Apr 27, 2021 at 08:12:53AM -0400, Olga Kornievskaia wrote:
> >>>>>>> On Tue, Apr 27, 2021 at 12:42 AM Dan Aloni <dan@xxxxxxxxxxxx> wrote:
> >>>>>>>> On Mon, Apr 26, 2021 at 01:19:43PM -0400, Olga Kornievskaia wrote:
> >>>>>>>>> From: Olga Kornievskaia <kolga@xxxxxxxxxx>
> >>>>>>>>>
> >>>>>>>>> An rpc client uses a transport switch and one or more transports
> >>>>>>>>> associated with that switch. Since transports are shared among
> >>>>>>>>> rpc clients, create a symlink into the xprt_switch directory
> >>>>>>>>> instead of duplicating entries under each rpc client.
> >>>>>>>>>
> >>>>>>>>> Signed-off-by: Olga Kornievskaia <kolga@xxxxxxxxxx>
> >>>>>>>>>
> >>>>>>>>> ..
> >>>>>>>>> @@ -188,6 +204,11 @@ void rpc_sysfs_client_destroy(struct rpc_clnt *clnt)
> >>>>>>>>>  	struct rpc_sysfs_client *rpc_client = clnt->cl_sysfs;
> >>>>>>>>>
> >>>>>>>>>  	if (rpc_client) {
> >>>>>>>>> +		char name[23];
> >>>>>>>>> +
> >>>>>>>>> +		snprintf(name, sizeof(name), "switch-%d",
> >>>>>>>>> +			 rpc_client->xprt_switch->xps_id);
> >>>>>>>>> +		sysfs_remove_link(&rpc_client->kobject, name);
> >>>>>>>>
> >>>>>>>> Hi Olga,
> >>>>>>>>
> >>>>>>>> If a client can use a single switch, shouldn't the name of the
> >>>>>>>> symlink be just "switch"?
> >>>>>>>> This is to be consistent with other symlinks in `sysfs`, such as
> >>>>>>>> the ones in the block layer; for example, in my `/sys/block/sda`:
> >>>>>>>>
> >>>>>>>> bdi -> ../../../../../../../../../../../virtual/bdi/8:0
> >>>>>>>> device -> ../../../5:0:0:0
> >>>>>>>
> >>>>>>> I think the client is written so that in the future it might have
> >>>>>>> more than one switch?
> >>>>>>
> >>>>>> I wonder what the use for that would be, as a switch is already a
> >>>>>> collection of xprts. What would determine the switch to use for a new
> >>>>>> task IO?
> >>>>>
> >>>>> I thought the switch is a collection of xprts of the same type. And if
> >>>> you wanted to have an RDMA connection and a TCP connection to the same
> >>>> server, then they would be stored under different switches? For instance,
> >>>> we round-robin through the transports, but I don't see why we would do so
> >>>> between a TCP and an RDMA transport. But I can see how a client could
> >>>> switch entirely from a TCP-based transport to an RDMA one (or to a set of
> >>>> transports, round-robining among that set). But perhaps I'm wrong in how
> >>>> I'm thinking about xprt_switch and multipathing.
> >>>>
> >>>> <looks like my reply bounced so trying to resend>
> >>>>
> >>>
> >>> And more to answer your question: we don't have a method to switch
> >>> between different xprt switches, but we also don't have a way to specify
> >>> a mount with multiple types of transports. Perhaps sysfs could be a way
> >>> to switch between the two. Perhaps during session trunking discovery,
> >>> the server can return a list of IPs and types of transports.
> >>> Say all IPs
> >>> would be available via TCP and RDMA; then the client can create a switch
> >>> with RDMA transports and another with TCP transports. Then perhaps there
> >>> will be a policy engine that decides which one to use to begin with, and
> >>> the sysfs interface would be a way to switch between the two if there
> >>> are problems.
> >>
> >> You raise a good point, also relevant to the ability to dynamically add
> >> new transports on the fly with the sysfs interface - their protocol type
> >> may be different.
> >>
> >> Returning to the topic of multiple switches per client, I recall it was
> >> hinted a few times that there is an intention to make the implementation
> >> details of xprtswitch loadable and pluggable with custom algorithms. If
> >> we are going in that direction, I'd expect that advanced transport
> >> management and request routing can live below the RPC client level,
> >> where we have example uses:
> >>
> >> 1) Optimizations in request routing that I've previously written about
> >> (based on request data memory).
> >>
> >> 2) If we lift the restriction against multiple protocol types on the
> >> same xprtswitch, then we can also implement 'RDMA by default with TCP
> >> failover on standby', similar to what you suggest, but with one switch
> >> maintaining two lists behind the scenes.
> >>
> >
> > I'm not that interested in supporting multiple switches per client, or
> > any setup that is asymmetric w.r.t. transport characteristics at this
> > time.
> >
> > Any such setup is going to need a policy engine in order to decide
> > which RPC calls can be placed on which set of transports. That again
> > will end up adding a lot of complexity in the kernel itself. I've yet
> > to see any compelling justification for doing so.
>
> I agree -- SMB multi-channel allows TCP+RDMA configurations, and it's
> tough to decide how to distribute work across connections and NICs
> that have such vastly different performance characteristics.
>
> I would like to see us crawling and walking before trying to run.

Sounds good, folks. I'll remove the multiple switches from the sysfs
infrastructure. v7 coming up.

> --
> Chuck Lever