Re: [PATCH 2/3] RDMA/rxe: Replace write_lock_bh with write_lock_irqsave in __rxe_drop_index

Bob Pearson <rpearsonhpe@xxxxxxxxx> · Fri, 11 Feb 2022 11:37:16 -0600

On 2/11/22 04:09, Guoqing Jiang wrote:
> 
> 
> On 2/10/22 11:49 PM, Bob Pearson wrote:
>> On 2/10/22 08:16, Zhu Yanjun wrote:
>>> On Thu, Feb 10, 2022 at 3:37 PM Guoqing Jiang <guoqing.jiang@xxxxxxxxx> wrote:
>>>> Same as __rxe_add_index, the lock needs to be fully IRQ safe, otherwise
>>>> the calltrace below appears.
>>>>
>> I had the impression that NAPI runs in a soft IRQ and that the rxe tasklets also run in soft IRQs. So at least in theory spin_lock_bh() should be sufficient. Can someone explain where the hard interrupt that we need to protect against is coming from?
> 
> Since rxe actually runs on top of a NIC, could it come from the NIC if the NIC driver doesn't switch to NAPI,
> or from other hardware? But my knowledge of this domain is limited.
> 
>>   There are other race conditions in current rxe that may also be the cause of this. I am trying to get a patch series accepted to deal with those.
> 
> If possible, could you investigate why rxe doesn't work after the 5.15 kernel, as reported in the cover letter? Thank you!
> 
> Guoqing

Guoqing,

It would help to know more about the test setup you are using, i.e. which NIC and driver.
I mostly test on the head of the tree, and things seem to be working there.
You could add something like

	if (in_irq())
		pr_warn_once("rxe: rxe_udp_encap_recv() called in hard IRQ context\n");

to rxe_udp_encap_recv() to check if you are in a hard interrupt in the receive path.

Bob