Re: [PATCH v2] IB/ipoib: improve latency in ipoib/cm connection formation

Jason Gunthorpe <jgg@xxxxxxxx> · Mon, 19 Apr 2021 14:55:20 -0300

On Wed, Apr 14, 2021 at 10:01:43AM +0000, Haakon Bugge wrote:

> ... and, if you anticipate that the UD QP is using pkey1 at indexX,
> the pkey table gets updated by the SM so the new entry in
> indexX becomes pkey2, the old pkey1 is now at a new position in the
> table (or not in the table is another case), let's say pkey1 is now
> found at indexY. Now, the connected mode QP will use pkey1 at indexY
> if a dedicated query is performed.

This is the concern. The SM is really supposed to keep the pkey table
stable; I think if it changes it should trigger some heavy flush.

So just confirm that the heavy flush caused a new pkey index to be
loaded and the UD side gets resynced, and we are good.

> Then we end up in a split brain, the UD QP uses pkey2 and the RC QPs
> use pkey1. With Manju's patch, they will at least use the same pkey.

Well, as you pointed out, it goes through the heavy flush and triggers
ipoib_pkey_dev_check_presence() which does update the pkey_index, so
it seems fine.
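
To make the scenario concrete, here is a rough userspace sketch of a
cached pkey index going stale when the table is rewritten, and of a
by-value lookup bringing it back in sync. This is only a model: struct
pkey_table, struct iface, find_pkey_index() and resync_pkey_index() are
hypothetical names made up for illustration, not the ipoib code, and the
exact-match compare is a simplification.

```c
#include <stddef.h>
#include <stdint.h>

#define PKEY_TABLE_LEN 4

/* Hypothetical stand-in for a port's pkey table (not a kernel structure). */
struct pkey_table {
	uint16_t entry[PKEY_TABLE_LEN];
};

/* Hypothetical per-interface state: the pkey and its cached table index. */
struct iface {
	uint16_t pkey;
	int pkey_index;
};

/* Return the index currently holding pkey, or -1 if it is not in the table. */
static int find_pkey_index(const struct pkey_table *t, uint16_t pkey)
{
	for (size_t i = 0; i < PKEY_TABLE_LEN; i++)
		if (t->entry[i] == pkey)
			return (int)i;
	return -1;
}

/*
 * Model of the resync expected on a heavy flush: look the pkey up again
 * by value and refresh the cached index. Returns 0 on success, -1 if the
 * pkey is no longer present (the cached index is left untouched).
 */
static int resync_pkey_index(struct iface *dev, const struct pkey_table *t)
{
	int idx = find_pkey_index(t, dev->pkey);

	if (idx < 0)
		return -1;
	dev->pkey_index = idx;
	return 0;
}
```

If pkey1 sits at index 1 and the table is then rewritten so that index 1
holds pkey2 and pkey1 moves to index 2, the cached index selects pkey2
until resync_pkey_index() runs. Any QP that keeps using the old index
while another one does a fresh lookup ends up on a different pkey, which
is the split brain described above.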

Applied to for-next

> Not related to this commit; I find it strange that the return value
> of update_child_pkey() is not used in __ipoib_ib_dev_flush().

The second callsite uses it

Jason