Re: [PATCH for-rc] Revert "RDMA/efa: Use API to get contiguous memory blocks aligned to device supported page size"


 



On 21/01/2020 18:39, Saleem, Shiraz wrote:
>> Subject: Re: [PATCH for-rc] Revert "RDMA/efa: Use API to get contiguous
>> memory blocks aligned to device supported page size"
>>
>> On 20/01/2020 16:10, Gal Pressman wrote:
>>> The cited commit leads to register MR failures and random hangs when
>>> running different MPI applications. The exact root cause for the issue
>>> is still not clear, this revert brings us back to a stable state.
>>>
>>> This reverts commit 40ddb3f020834f9afb7aab31385994811f4db259.
>>>
>>> Fixes: 40ddb3f02083 ("RDMA/efa: Use API to get contiguous memory
>>> blocks aligned to device supported page size")
>>> Cc: Shiraz Saleem <shiraz.saleem@xxxxxxxxx>
>>> Cc: stable@xxxxxxxxxxxxxxx # 5.3
>>> Signed-off-by: Gal Pressman <galpress@xxxxxxxxxx>
>>
>> Shiraz, I think I found the root cause here.
>> I'm noticing a register MR of size 32k, which is constructed from two sges, the first
>> sge of size 12k and the second of 20k.
>>
>> ib_umem_find_best_pgsz returns page shift 13 in the following way:
>>
>> 0x103dcb2000      0x103dcb5000       0x103dd5d000           0x103dd62000
>>           +----------+                      +------------------+
>>           |          |                      |                  |
>>           |  12k     |                      |     20k          |
>>           +----------+                      +------------------+
>>
>>           +------+------+                 +------+------+------+
>>           |      |      |                 |      |      |      |
>>           | 8k   | 8k   |                 | 8k   | 8k   | 8k   |
>>           +------+------+                 +------+------+------+
>> 0x103dcb2000       0x103dcb6000   0x103dd5c000              0x103dd62000
>>
>>
> 
> Gal - would be useful to know the IOVA (virt) and umem->addr also for this MR in ib_umem_find_best_pgsz

I'll update my debug prints to include the iova and rerun the tests.
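
For reference, the block boundaries in the quoted diagram can be reproduced with a few lines of arithmetic. This is only a sketch: it is not the kernel's ib_umem_find_best_pgsz(), and the helper name covering_blocks is made up for illustration. It assumes nothing beyond the two SGE addresses and sizes shown in the diagram and the reported page shift of 13.

```python
# Illustration only: NOT the kernel's ib_umem_find_best_pgsz().
# It reproduces the 8k block boundaries drawn in the diagram above.

PAGE_SHIFT = 13           # page shift reported in the thread
BLOCK = 1 << PAGE_SHIFT   # 8k

# (start address, length in bytes) of the two SGEs from the diagram
SGES = [
    (0x103dcb2000, 12 * 1024),
    (0x103dd5d000, 20 * 1024),
]


def covering_blocks(start, length, block=BLOCK):
    """Hypothetical helper: block-aligned range that covers [start, start + length).

    Returns (aligned_start, aligned_end, number_of_blocks).
    """
    aligned_start = start & ~(block - 1)                       # round down
    aligned_end = (start + length + block - 1) & ~(block - 1)  # round up
    return aligned_start, aligned_end, (aligned_end - aligned_start) // block


if __name__ == "__main__":
    for start, length in SGES:
        first, end, count = covering_blocks(start, length)
        head_pad = start - first            # bytes covered before the SGE starts
        tail_pad = end - (start + length)   # bytes covered after the SGE ends
        print(f"sge {start:#x}+{length // 1024}k -> "
              f"{first:#x}..{end:#x} ({count} x 8k, "
              f"head pad {head_pad}, tail pad {tail_pad})")
```

With page shift 13, the 12k SGE is covered by two 8k blocks and the 20k SGE by three, so five 8k blocks (40k) describe a 32k MR, with 4k of padding after the first SGE and 4k before the second.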


