Hi Aurélien,

Thanks for the inputs.

On Tue, Jan 10, 2023 at 6:30 PM Aurélien Aptel <aurelien.aptel@xxxxxxxxx> wrote:
>
> Hey Shyam,
>
> I remember thinking that channels should be part of the server too
> when I started working on this, but switched it to the session as I
> kept working on it and found it was the right choice.
> I don't remember all the details, so my comments will be a bit vague.

I'm not proposing to move the channels to per-server here, but to move
the connections to per-server. Each channel is associated with a
connection.
I think our use of "server" and "connection" interchangeably is causing
this confusion. :)

>
> On Tue, Jan 10, 2023 at 10:16 AM Shyam Prasad N <nspmangalore@xxxxxxxxx> wrote:
> > 1.
> > The way connections are organized today, the connections of primary
> > channels of sessions can be shared among different sessions and their
> > channels. However, connections of secondary channels are not shared,
> > i.e. they are created with nosharesock.
> > Is there a reason why we have it that way?
> > We could have a pool of connections for a particular server. When new
> > channels are to be created for a session, we could simply pick
> > connections from this pool.
> > Another approach could be not to share sockets for any of the channels
> > of multichannel mounts. This way, multichannel would implicitly mean
> > nosharesock. Assuming that multichannel is being used for performance
> > reasons, this would actually make a lot of sense. Each channel would
> > create a new connection to the server, and take advantage of the
> > number of interfaces and the RSS capabilities of the server interfaces.
> > I'm planning to take the latter approach for now, since it's easier.
> > Please let me know your opinion on this.
>
> First, in the abstract model, Channels are kept in the Session object.
> https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-smb2/8174c219-2224-4009-b96a-06d84eccb3ae
>
> Channels and sessions are intertwined. Channel signing keys depend on
> the session they are connected to (see "3.2.5.3.1 Handling a New
> Authentication" and "3.2.5.3.3 Handling Session Binding").
> Think carefully about what should be done on disconnect/reconnect,
> especially if the channel is shared with multiple sessions.

That's a good point. But that affects other cases too, like multiuser
mounts and cases where a session shares a socket, doesn't it?
I think we call cifs_reconnect very liberally today. We should give
this another look, IMO.

>
> The problem with the pool approach is that mount options might require
> different connections, so sharing is not so easy. And reconnecting
> might involve different fallbacks (DFS) for different sessions.
>
> You should see the server struct as the "final destination". Once it's
> picked, we know it's going there.
>
> > 2.
> > Today, the interface list for a server hangs off the session struct. Why?
> > Doesn't it make more sense to hang it off the server struct? With my
> > recent changes to query the interface list from the server
> > periodically, each tcon is querying this and keeping the results in
> > the session struct.
> > I plan to move this to the server struct too, and avoid having to
> > query it too many times unnecessarily. Please let me know if you see
> > a reason not to do this.
>
> It's more convenient to have the interface list at the same place as
> the channel list, but it could be moved, I suppose.
> In the abstract model it's in the server, apparently.

Yeah. It makes more sense to keep it on a per-server basis.
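To make sure we're talking about the same structure, here is a rough
userspace-only sketch of the ownership I have in mind. The struct and
field names below are made up purely for illustration; they are not the
actual cifs structs:

    #include <stdio.h>

    /* illustrative only -- not the real cifs definitions */
    struct smb_iface {
            char name[16];
            unsigned long long speed_bps;   /* advertised link speed */
            int rss_capable;
            struct smb_iface *next;
    };

    /* one instance per server (i.e. per TCP connection endpoint) */
    struct server_info {
            struct smb_iface *iface_list;   /* owned and refreshed here */
            unsigned long iface_last_update;
    };

    /* one instance per SMB session */
    struct smb_session {
            struct server_info *server;     /* sessions only reference it */
            /* channels stay here, as in the spec's abstract model */
    };

    int main(void)
    {
            struct smb_iface eth0 = { "eth0", 100000000000ULL, 1, NULL };
            struct server_info srv = { &eth0, 0 };
            struct smb_session ses1 = { &srv }, ses2 = { &srv };

            /* both sessions see the same single copy of the iface list */
            printf("%s %s\n", ses1.server->iface_list->name,
                   ses2.server->iface_list->name);
            return 0;
    }

i.e. the server owns the one copy of the interface list and does the
periodic refresh, and the sessions/channels only reference it.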
>
> > 4.
> > I also feel that the way an interface is selected today for
> > multichannel will not scale.
> > We keep selecting the fastest server interface, if it supports RSS.
> > IMO, we should be distributing the requests among the server
> > interfaces, based on the interface speed advertised.
>
> RSS means the interface can process packets in parallel queues. The
> problem is we don't know how many queues it has.
> I'm not sure you can find an optimal algorithm for all NIC
> vendor/driver combinations. Probably you need to do some tests with a
> bunch of different HW or find someone knowledgeable.
> From my small experience now at mellanox/nvidia, I have yet to see
> fewer than 8 rx/combined queues. You can get/set the number with
> ethtool -l/-L.
> I set the max channel count to 16 at the time, but I still don't know
> what large-scale, high-speed deployments of SMB look like.
> For what it's worth, in the NVMe-TCP tests I'm doing at the moment and
> the systems we use to test (fio reads with a 100Gbps Ethernet NIC with
> 63 HW queues, 96-core CPUs on the host & target, reading from a null
> block target), we get diminishing returns around 24 parallel
> connections. I don't know how transferable this data point is.
>
> On that topic, for best performance, a possible future project could
> be to assign steering rules on the client to force each channel's
> packet processing onto a different CPU, making sure those CPUs are the
> same ones running the demultiplex threads (avoids context switches and
> contention). See
> https://www.kernel.org/doc/Documentation/networking/scaling.txt
> (warning, not an easy read lol)

Interesting. I'm guessing that this is the "send side scaling" that's
mentioned here:
https://learn.microsoft.com/en-us/previous-versions/windows/it-pro/windows-server-2012-r2-and-2012/hh997036(v=ws.11)

Also, regarding point 4 above: I've put a rough sketch of the kind of
speed-weighted channel placement I have in mind at the end of this mail.

>
> Cheers,

--
Regards,
Shyam
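P.S. Here is the sketch I mentioned for point 4. It is a standalone
userspace illustration, not cifs code: the names are made up, and the
heuristic (pick the interface with the highest advertised speed per
already-assigned channel) is just one possible way to weight the
placement; the real logic would also need to consider RSS capability
and reconnects.

    #include <stdio.h>

    struct iface {
            const char *name;
            unsigned long long speed;       /* advertised bps */
            unsigned int assigned;          /* channels placed on it so far */
    };

    /* pick the interface with the most advertised speed left per channel */
    static struct iface *pick_iface(struct iface *ifaces, int n)
    {
            struct iface *best = NULL;
            int i;

            for (i = 0; i < n; i++) {
                    unsigned long long score =
                            ifaces[i].speed / (ifaces[i].assigned + 1);

                    if (!best || score > best->speed / (best->assigned + 1))
                            best = &ifaces[i];
            }
            return best;
    }

    int main(void)
    {
            struct iface ifaces[] = {
                    { "100g-rss", 100000000000ULL, 0 },
                    { "25g-rss",   25000000000ULL, 0 },
                    { "10g",       10000000000ULL, 0 },
            };
            int nifaces = sizeof(ifaces) / sizeof(ifaces[0]);
            int chan;

            for (chan = 0; chan < 8; chan++) {
                    struct iface *iface = pick_iface(ifaces, nifaces);

                    iface->assigned++;
                    printf("channel %d -> %s\n", chan, iface->name);
            }
            return 0;
    }

With 100/25/10 Gbps interfaces and 8 channels, this puts most channels
on the 100G interface but still spills over to the 25G one, instead of
binding every channel to the single fastest interface.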