Re: [PATCH for-next 13/16] IB/{hfi1, ipoib, rdma}: Broadcast ping sent packets which exceeded mtu size

On 2/21/2020 6:32 PM, Jason Gunthorpe wrote:
On Fri, Feb 21, 2020 at 02:40:28PM -0500, Dennis Dalessandro wrote:
On 2/18/2020 7:42 PM, Jason Gunthorpe wrote:
On Mon, Feb 10, 2020 at 08:19:44AM -0500, Dennis Dalessandro wrote:
From: Gary Leshner <Gary.S.Leshner@xxxxxxxxx>

When in connected mode ipoib sent broadcast pings which exceeded the mtu
size for broadcast addresses.

Add an mtu attribute to the rdma_netdev structure which ipoib sets to its
mcast mtu size.

The RDMA netdev uses this value to determine if the skb length is too long
for the mtu specified and if it is, drops the packet and logs an error
about the errant packet.

I'm confused by this comment, connected mode is not able to use
rdma_netdev, for various technical reasons, I thought?

Is this somehow running a rdma_netdev concurrently with connected
mode? How?

No, not concurrently. When ipoib is in connected mode, a broadcast request,
something like:

ping -s 2017 -i 0.001 -c 10 -M do -I ib0 -b 192.168.0.255

will be sent down from user space to ipoib. At an mcast_mtu of 2048, the max
payload size is 2016 (2048 - 28 - 4). If AIP is not being used then the
datagram send function (ipoib_send()) does a check and drops the packet.
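The payload arithmetic above can be sketched as a small standalone C model. Reading the 28 as an IPv4 header without options (20) plus the ICMP echo header (8), and the 4 as the IPoIB encapsulation header, is an assumption about what the mail means; the function names are hypothetical and are not part of ipoib or hfi1.

```c
#include <stddef.h>

/* Header sizes assumed for the "2048 - 28 - 4" arithmetic in the mail. */
enum {
    IPV4_HDR_LEN    = 20, /* IPv4 header without options (assumption) */
    ICMP_HDR_LEN    = 8,  /* ICMP echo header (assumption) */
    IPOIB_ENCAP_LEN = 4   /* IPoIB encapsulation header (assumption) */
};

/* Hypothetical helper: largest `ping -s` payload that fits in mcast_mtu. */
static size_t max_ping_payload(size_t mcast_mtu)
{
    size_t overhead = IPV4_HDR_LEN + ICMP_HDR_LEN + IPOIB_ENCAP_LEN;

    /* Guard against unsigned underflow for very small MTU values. */
    if (mcast_mtu < overhead)
        return 0;
    return mcast_mtu - overhead;
}

/* Hypothetical helper: nonzero if the payload is too large for mcast_mtu. */
static int ping_exceeds_mcast_mtu(size_t payload, size_t mcast_mtu)
{
    return payload > max_ping_payload(mcast_mtu);
}
```

With an mcast_mtu of 2048 this model gives a limit of 2016, so the `-s 2017` in the ping command is one byte over.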

However, when AIP is enabled, ipoib_send() is of course not used and we land in
the rn->send function, which needs to do the same check.
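A minimal userspace model of the kind of length check described here might look as follows. The struct and function names are hypothetical stand-ins (not the kernel's sk_buff or rdma_netdev), and the exact comparison and error code used by the real patch are not shown in this mail, so both are assumptions.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an skb: only the length matters here. */
struct pkt_model {
    size_t len;
};

/* Hypothetical stand-in for an rdma_netdev carrying an mtu attribute. */
struct rn_model {
    size_t mtu;               /* set by the upper layer to its mcast mtu */
    unsigned long tx_dropped; /* count of packets rejected as too long */
};

/*
 * Model of a send hook that refuses oversized packets.
 * Returns 0 if the packet may be sent, -EMSGSIZE if it was dropped.
 * (The error code is an assumption chosen for illustration.)
 */
static int rn_model_send(struct rn_model *rn, const struct pkt_model *pkt)
{
    if (pkt->len > rn->mtu) {
        rn->tx_dropped++;
        return -EMSGSIZE;
    }
    return 0;
}
```

The point of the sketch is only that the check lives in the send hook itself, so it applies whichever path delivered the packet.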

You just contradicted yourself: the first sentence was 'not
concurrently' and here you say we have connected mode turned on and
yet a packet is delivered to AIP, so what do you mean?

AIP provides a rdma_netdev (rn) that overloads the rn inside ipoib. When a broadcast skb is passed down from user space, even in connected mode, the skb is forwarded to the rn to be sent out.

What I mean is if you can do connected mode you don't have a
rdma_netdev and you can't do AIP.

The rdma_netdev is always present, regardless of the ipoib mode.

How are things in connected mode and a rdma_netdev is available?

So we overload the rn not only for datagram mode but for connected mode as well.

The rdma_netdev is set up when ipoib first finds the port, not when the mode is switched through sysfs. Therefore, it has to be there always, even in connected mode.

In hfi1_ipoib_setup_rn() (the setup function for rdma_netdev), we set:
    rn->send = hfi1_ipoib_send

We also keep the default netdev_ops and overload them with our own netdev_ops to set up and tear down resources during netdev init/uninit/open/close:

    priv->netdev_ops = netdev->netdev_ops;
    netdev->netdev_ops = &hfi1_ipoib_netdev_ops;
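The save-and-override pattern in those two assignments can be illustrated with a small userspace model. All types and names below are hypothetical; the order in which the wrapper does its own setup and chains to the saved default is also an assumption, not taken from the hfi1 code.

```c
#include <stddef.h>

struct dev_model;

/* Hypothetical, reduced ops table with a single callback. */
struct ops_model {
    int (*open)(struct dev_model *dev);
};

struct dev_model {
    const struct ops_model *ops;       /* active ops, like netdev->netdev_ops */
    const struct ops_model *saved_ops; /* saved default, like priv->netdev_ops */
    int resources_ready;               /* set by the wrapper */
    int opened;                        /* set by the default open */
};

/* Default behaviour supplied by the upper layer. */
static int default_open(struct dev_model *dev)
{
    dev->opened = 1;
    return 0;
}

static const struct ops_model default_ops = { .open = default_open };

/* Wrapper: do driver-specific setup, then chain to the saved default. */
static int wrapped_open(struct dev_model *dev)
{
    dev->resources_ready = 1;
    return dev->saved_ops->open(dev);
}

static const struct ops_model wrapped_ops = { .open = wrapped_open };

/* Mirrors the two assignments above: save the default, install the wrapper. */
static void setup_model(struct dev_model *dev)
{
    dev->saved_ops = dev->ops;
    dev->ops = &wrapped_ops;
}
```

Because the default ops are saved rather than discarded, the wrapper can add its resource handling without reimplementing what the upper layer already does on open.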

-Denny