Re: Connection sharing in SMB multichannel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Tom,

As always, thanks for the detailed review. :)

On Wed, Jan 11, 2023 at 10:32 PM Tom Talpey <tom@xxxxxxxxxx> wrote:
>
> On 1/10/2023 4:16 AM, Shyam Prasad N wrote:
> > 3.
> > I saw that there was a bug in iface_cmp, where we did not do full
> > comparison of addresses to match them.
> > Fixed it here:
> > https://github.com/sprasad-microsoft/smb3-kernel-client/commit/cef2448dc43d1313571e21ce8283bccacf01978e.patch
> >
> > @Tom Talpey Was this your concern with iface_cmp?
>
> Took a look at this and I do have some comments.
>
> Regarding the new address comparator cifs_ipaddr_cmp(), why
> does it return a three-value result? It seems to prefer
> AF_INET over AF_UNSPEC, and AF_INET6 over AF_INET. Won't
> this result in selecting the same physical interface more
> than once?

The idea is to return consistent values for a given pair, so that, for
interfaces with exactly same advertised speeds, they're ordered based
on address family first, then based on what memcmp returns.

>
> Also, it is comparing the entire contents of the sockaddrs,
> including padding and scope id's, which have no meaning on
> this end of the wire. That will lead to mismatch.

My familiarity with IPv6 is relatively low. Steve actually pointed me
to NFS client code which does a similar comparison. Let me redo this
part.

>
>
> Regarding the interface comparator, which I'll requote here:
>
> +/*
> + * compare two interfaces a and b
> + * return 0 if everything matches.
>
> This is fine, assuming the address comparator is fixed. Matching
> everything is the best result.
>
> + * return 1 if a has higher link speed, or rdma capable, or rss capable
>
> I'm still uneasy about selecting link speed first. If the mount
> specifies RDMA, I think RDMA should be the preferred parameter.
> The code you propose would select an ordinary interface, if it
> were one bps faster.

I could do that. But that only changes the order in which the
interfaces are ordered.
I think the place where you want this change more is where the
interface is actually getting selected for a new connection.
I'll do this.

>
> + * return -1 otherwise.
>
> Ok on this. :)
>
> + */
> +static int
> +iface_cmp(struct cifs_server_iface *a, struct cifs_server_iface *b)
> +{
> +       int cmp_ret = 0;
> +
> +       WARN_ON(!a || !b);
> +       if (a->speed == b->speed) {
> +               if (a->rdma_capable == b->rdma_capable) {
> +                       if (a->rss_capable == b->rss_capable) {
>
> RSS is meaningless on an RDMA adapter. The RDMA kernel API uses
> Queue Pairs to direct completion interrupts, totally different
> and independent from RSS. It's only meaningful if rdma_capable
> is false.

Ok. I was not sure about whether they can co-exist, and decided to
make this extra comparison anyway.
Good that you shared this info. But I don't think this code hurts.

>
> +                               cmp_ret = cifs_ipaddr_cmp((struct sockaddr *) &a->sockaddr,
> +                                                      (struct sockaddr *) &b->sockaddr);
> +                               if (!cmp_ret)
> +                                       return 0;
> +                               else if (cmp_ret > 0)
> +                                       return 1;
>
> Again, I don't agree that the address family has anything to do
> with preferring an interface. This should return -1.

As explained above, this just decides the ordering of the interface list.
The items on the head of the list do get slight preference, the list
is first sorted based on speed, then on capabilities, then on family,
then on the actual value.

>
> +                               else
> +                                       return -1;
> +                       } else if (a->rss_capable > b->rss_capable)
> +                               return 1;
>
> It's good to prefer an RSS-capable interfgace over non-RSS, but
> again, only if the interface is not RDMA.
>
> +                       else
> +                               return -1;
> +               } else if (a->rdma_capable > b->rdma_capable)
> +                       return 1;
>
> And it's good to prefer RDMA over non-RDMA, but again, it's going
> to have very strange results if the client starts sending traffic
> over both interfaces for the same flow!

I understand your concern on this. And I can fix this.

>
> +               else
> +                       return -1;
> +       } else if (a->speed > b->speed)
> +               return 1;
> +       else
> +               return -1;
> +}
>
> And again, speeds are only one kind of tiebreaker. This code
> only looks at the RSS and RDMA attributes when the speeds
> match. The Windows client, for example, *always* prefers RDMA
> if available. But Windows has no explicit RDMA mode, it always
> starts with TCP. The Linux client is different, yet this code
> doesn't seem to check.
>
> Personally, I think that without an explicit -o rdma, the code
> should attempt to connect to any discovered RDMA server
> interfaces, via RDMA, and use them exclusively if they work.
> Otherwise, it should stay on TCP.

So even for this case, no mixing?

>
> OTOH, if -o rdma was specified, the client should ignore TCP,
> or at least, not silently fall back to TCP.

Understood.

>
> In other words, these tests are still too simplistic, and too
> likely to result in unexpected mixes of channels. Should we step
> back and write a flowchart?

I think I understood your concerns.
I'll address all these concerns with the next version of my patch and
then check if we're on the same page.

>
> Tom.



-- 
Regards,
Shyam



[Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux