Re: [PATCH rdma-next] IB/ipoib: Fix wqe initialized param on ipoib set mode to connected

On Wed, Mar 14, 2018 at 10:53 PM, Jason Gunthorpe <jgg@xxxxxxxxxxxx> wrote:
> On Wed, Mar 14, 2018 at 10:35:15PM +0200, Erez Shitrit wrote:
>> On Wed, Mar 14, 2018 at 8:09 PM, Jason Gunthorpe <jgg@xxxxxxxxxxxx> wrote:
>> > On Wed, Mar 14, 2018 at 03:22:59PM +0200, Erez Shitrit wrote:
>> >
>> >> Perhaps just to take that line ("priv->tx_wr.wr.opcode = IB_WR_SEND;")
>> >> few lines below, after the call for ipoib_flush_paths(dev) will solve
>> >> the race.
>> >>
>> >> (Because after the call for ipoib_flush_paths() we can be sure that no
>> >> packets from LSO type will be sent)
>> >
>> > But we've already enabled CM mode so we can't be sure a CM packet
>> > wasn't sent using the wrong wr opcode, so we are back to having the
>> > original race.
>>
>> What makes packet to be sent via CM is if it has neigh from CM
>> connection.
>
> neighs can be created at any time. Once IPOIB_FLAG_ADMIN_CM is set a
> new neigh could potentially use the CM path. I didn't notice any
> locking preventing path_rec_completion() from running concurrently
> with mode change.
>
> So the instant the switch code does the set_bit(IPOIB_FLAG_ADMIN_CM)
> we can start txing packets down the cm tx path, and until
> ipoib_flush_paths() completes *both* paths must be considered active.

Agree.

>
>> so, till ipoib_flush_paths() all packets will be sent UD, after that
>> new connections requests will be sent via CM.
>
> This remark only applies to existing neighbours in the neigh database,
> not to new neighs.
>
> Jason

The original issue was that after changing to CM mode, traffic might
stop for a very long time (depending on the ARP timeout, at least 30 sec).
Now, if we move the line to after the ipoib_flush_paths() call, the
problem is much smaller:
    only while ipoib_flush_paths() runs, a packet sent on a CM
connection after a UD/GSO packet will be dropped.
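
To make the window concrete, here is a rough user-space model of the two
orderings. It is only a sketch, not driver code: model_priv, ud_send_gso(),
cm_send() and simulate_switch() are made-up names, the opcode values are
local stand-ins, and it assumes the CM send path fails whenever the shared
work request still carries the LSO opcode.

```c
#include <stdbool.h>

/* Local stand-ins for IB_WR_SEND / IB_WR_LSO; the values are arbitrary. */
enum wr_opcode { WR_SEND, WR_LSO };

/* Hypothetical model of the two pieces of state discussed in the thread. */
struct model_priv {
    bool admin_cm;          /* models IPOIB_FLAG_ADMIN_CM */
    enum wr_opcode opcode;  /* models priv->tx_wr.wr.opcode */
};

/* UD path: a GSO packet leaves the shared work request set to LSO. */
void ud_send_gso(struct model_priv *p)
{
    p->opcode = WR_LSO;
}

/* CM path (assumption): the send only succeeds with the SEND opcode. */
bool cm_send(const struct model_priv *p)
{
    return p->admin_cm && p->opcode == WR_SEND;
}

/*
 * Walk through one mode switch and return how many CM sends were dropped.
 * reset_after_flush: set the opcode after the flush instead of before it.
 * gso_during_flush:  a UD/GSO packet is sent while the flush is running.
 */
int simulate_switch(bool reset_after_flush, bool gso_during_flush)
{
    /* Start with a stale LSO opcode left behind by earlier GSO traffic. */
    struct model_priv p = { .admin_cm = false, .opcode = WR_LSO };
    int dropped = 0;

    p.admin_cm = true;              /* set_bit(IPOIB_FLAG_ADMIN_CM) */
    if (!reset_after_flush)
        p.opcode = WR_SEND;

    /* Flush in progress: both the UD and the CM path are active. */
    if (gso_during_flush)
        ud_send_gso(&p);
    if (!cm_send(&p))               /* CM packet from a new neigh */
        dropped++;

    /* Flush finished: no more UD/GSO packets from here on. */
    if (reset_after_flush)
        p.opcode = WR_SEND;
    if (!cm_send(&p))               /* CM packet after the flush */
        dropped++;

    return dropped;
}
```

In this model, resetting after the flush never loses more than the one CM
packet sent inside the flush window, while resetting before the flush can
be undone by a GSO packet that arrives during the flush.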

The question is whether this is something that we really need to handle.
The error flow in CM mode already does this (also for other, more
frequent error flows like RNR etc.).
Also, we are talking about a case that is unlikely in the real life
of the ipoib driver: changing the mode is something that is done once,
at the beginning. Should we add code for this rare case?

Erez