On Tue, Mar 13, 2018 at 5:01 PM, Alex Vesker <valex@xxxxxxxxxxxx> wrote: > > > On 3/12/2018 5:39 PM, Jason Gunthorpe wrote: >> >> On Sun, Mar 11, 2018 at 01:28:23PM +0200, Leon Romanovsky wrote: >>> >>> From: Shalom Lagziel <shaloml@xxxxxxxxxxxx> >>> >>> In IPoIB connected mode the WQE opcode can be IB_WR_SEND only (stateless >>> offloads are supported only in datagram mode), but it wasn't initialized. >>> It leads to unpredictable CQE errors which were triggered by >>> ipoib_cm_send() flow. >> >> This description is confusing. I think it is trying to say 'when >> switching from UD to RC mode the wr.opcode can be left at IB_WR_LSO >> ..' >> >> Sounds like -rc and -stable material, no? >> >>> In order to fix it, we initialize the wqe opcode in the set mode function >>> to be IB_WR_SEND. >>> >>> Reviewed-by: Erez Shitrit <erezsh@xxxxxxxxxxxx> >>> Signed-off-by: Shalom Lagziel <shaloml@xxxxxxxxxxxx> >>> Signed-off-by: Leon Romanovsky <leon@xxxxxxxxxx> >>> drivers/infiniband/ulp/ipoib/ipoib_main.c | 1 + >>> 1 file changed, 1 insertion(+) >>> >>> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c >>> b/drivers/infiniband/ulp/ipoib/ipoib_main.c >>> index 161ba8c76285..8f4ac2765012 100644 >>> +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c >>> @@ -539,6 +539,7 @@ int ipoib_set_mode(struct net_device *dev, const char >>> *buf) >>> dev_set_mtu(dev, ipoib_cm_max_mtu(dev)); >>> rtnl_unlock(); >>> priv->tx_wr.wr.send_flags &= ~IB_SEND_IP_CSUM; >>> + priv->tx_wr.wr.opcode = IB_WR_SEND; >> >> And shouldn't this be before the >> >> set_bit(IPOIB_FLAG_ADMIN_CM, &priv->flags); >> >> ? >> >> And how does this patch avoid races with post_send() in ipoib_ib.c ? >> That should be explained in a comment or patch description.. >> >> I think this needs to halt the queues and take the device to down >> before changing the flag to avoid the races... >> >> Jason > > > You are right there might be races with post_send and the opcode even with > the fix, > we can set the opcode every time when sending in connected mode which will > take > care of the opcode race. > There is only one send in parallel when working in non-enhanced mode and > enhanced > mode is datagram only. > > Speaking on higher level it might help the stability to halt queue before > the switch. > Let`s look into it. > Perhaps just to take that line ("priv->tx_wr.wr.opcode = IB_WR_SEND;") few lines below, after the call for ipoib_flush_paths(dev) will solve the race. (Because after the call for ipoib_flush_paths() we can be sure that no packets from LSO type will be sent) > >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in >> the body of a message to majordomo@xxxxxxxxxxxxxxx >> More majordomo info at http://vger.kernel.org/majordomo-info.html > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-rdma" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html