Re: [PATCH for-rc] IB/cma: Fix false P_Key mismatch messages

> On 10 May 2021, at 21:12, Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
> 
> On Mon, May 10, 2021 at 06:52:54PM +0000, Haakon Bugge wrote:
>> 
>> 
>>> On 10 May 2021, at 19:04, Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:
>>> 
>>> On Wed, May 05, 2021 at 02:54:01PM +0200, Håkon Bugge wrote:
>>>> There are three conditions that must be fulfilled in order to consider
>>>> a partition match. Those are:
>>>> 
>>>>     1. Both P_Keys must be valid
>>>>     2. At least one must be a full member
>>>>     3. The partitions (lower 15 bits) must match
>>>> 
>>>> In a system employing both limited and full membership ports, we see
>>>> these false warning messages:
>>>> 
>>>> RDMA CMA: got different BTH P_Key (0x2a00) and primary path P_Key (0xaa00)
>>>> RDMA CMA: in the future this may cause the request to be dropped
>>>> 
>>>> even though the partition is the same.
>>>> 
>>>> See IBTA 10.9.1.2 Special P_Keys and 10.9.3 Partition Key Matching for
>>>> a reference.
>>>> 
>>>> Fixes: 84424a7fc793 ("IB/cma: Print warning on different inner and header P_Keys")
>>>> Signed-off-by: Håkon Bugge <haakon.bugge@xxxxxxxxxx>
>>>> drivers/infiniband/core/cma.c | 22 ++++++++++++++++++++--
>>>> 1 file changed, 20 insertions(+), 2 deletions(-)
>>> 
>>> What is this trying to fix?
>> 
>> The false warning messages. The wrong way though:-)
>> 
>>> IMHO it is a bug on the sender side to send GMPs to use a pkey that
>>> doesn't exactly match the data path pkey.
>> 
>> The active connector calls ib_addr_get_pkey(). This function
>> extracts the pkey from byte 8/9 in the device's bcast
>> address. However, RFC 4391 explicitly states:
> 
> pkeys in CM come only from path records that the SM returns, the above
> should only be used to feed into a path record query which could then
> return back a limited pkey.
> 
> Everything thereafter should use the SM's version of the pkey.

Revisiting this. I think I misinterpreted the scenario that led to the P_Key mismatch messages.

The CM takes the pkey_index that matched the P_Key in the BTH (cm_get_bth_pkey()) and then calls ib_get_cached_pkey() to read the P_Key value stored at that pkey_index.

Assume a full member sends a REQ. In that case, both P_Keys (BTH and primary path_rec) are full. Further, assume the recipient is only a limited member. Since full and limited members of the same partition are eligible to communicate, the packet is accepted, but the P_Key retrieved by cm_get_bth_pkey() will be the limited one from the recipient's own P_Key table.
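
A rough sketch of that receive-side lookup in plain C, to make the scenario concrete. This is a model, not the kernel code: find_matching_pkey_index() and the table contents are hypothetical, and treating a P_Key whose lower 15 bits are all zero as invalid is an assumption based on my reading of IBTA 10.9.1.2.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PKEY_FULL_MEMBER 0x8000u /* bit 15: membership (1 = full) */
#define PKEY_BASE_MASK   0x7fffu /* lower 15 bits: the partition */

/*
 * Hypothetical model of matching an incoming BTH P_Key against the
 * receiving port's P_Key table. Returns the index of the first table
 * entry that is in the same partition and may talk to the sender,
 * or -1 if there is none.
 */
static int find_matching_pkey_index(const uint16_t *table, size_t len,
				    uint16_t bth_pkey)
{
	assert(table != NULL);

	for (size_t i = 0; i < len; i++) {
		uint16_t entry = table[i];

		/* Assumption: base value 0 marks an unused/invalid entry */
		if ((entry & PKEY_BASE_MASK) == 0)
			continue;
		/* Partitions (lower 15 bits) must be identical */
		if ((entry & PKEY_BASE_MASK) != (bth_pkey & PKEY_BASE_MASK))
			continue;
		/* At least one side must be a full member */
		if ((entry | bth_pkey) & PKEY_FULL_MEMBER)
			return (int)i;
	}
	return -1;
}
```

With a table of { 0x7fff, 0x2a00 } and a BTH P_Key of 0xaa00, the sketch selects index 1. Reading that index back yields 0x2a00, not the 0xaa00 that was on the wire.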

The CMA will then give a warning message, because the P_Key in the primary path and the one incorrectly assumed to be in the BTH do not match.

The point is that cm_get_bth_pkey() may return the limited-member P_Key, even though the packet on the wire had the full-member P_Key in its BTH.
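
For illustration, the three conditions from the commit message can be written as a small standalone check. This is a minimal sketch, not the code in the patch: pkey_is_valid() and pkeys_match() are hypothetical names, and the validity rule (lower 15 bits not all zero) is an assumption to be verified against IBTA 10.9.1.2.

```c
#include <stdbool.h>
#include <stdint.h>

#define PKEY_FULL_MEMBER 0x8000u /* bit 15: membership (1 = full) */
#define PKEY_BASE_MASK   0x7fffu /* lower 15 bits: the partition */

/* Assumption: a P_Key with an all-zero base value is invalid */
static bool pkey_is_valid(uint16_t pkey)
{
	return (pkey & PKEY_BASE_MASK) != 0;
}

/* Hypothetical helper applying the three conditions in order */
static bool pkeys_match(uint16_t a, uint16_t b)
{
	/* 1. Both P_Keys must be valid */
	if (!pkey_is_valid(a) || !pkey_is_valid(b))
		return false;
	/* 2. At least one must be a full member */
	if (!((a | b) & PKEY_FULL_MEMBER))
		return false;
	/* 3. The partitions (lower 15 bits) must match */
	return (a & PKEY_BASE_MASK) == (b & PKEY_BASE_MASK);
}
```

Under these rules the pair from the warning, 0x2a00 and 0xaa00, is a match: both are valid, 0xaa00 is a full member, and both have base value 0x2a00. A plain equality test on the 16-bit values reports a mismatch for the same pair.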

So, I think my first commit here isn't that bad after all :-)


Thxs, Håkon




