RE: [PATCH] rhashtable: Fix potential deadlock by moving schedule_work outside lock

From: Breno Leitao <leitao@xxxxxxxxxx> Sent: Thursday, January 2, 2025 2:16 AM
> 
> On Sat, Dec 21, 2024 at 05:06:55PM +0800, Herbert Xu wrote:
> > On Thu, Dec 12, 2024 at 08:33:31PM +0800, Herbert Xu wrote:
> > >
> > > The growth check should stay with the atomic_inc.  Something like
> > > this should work:
> >
> > OK I've applied your patch with the atomic_inc move.
> 
> Sorry, I was on vacation, and I am back now. Let me know if you need
> anything further.
> 
> Thanks for fixing it,
> --breno

Breno and Herbert --

This patch seems to break things in linux-next. I'm testing with
linux-next-20250108 in a VM in the Azure public cloud. The Mellanox mlx5
Ethernet NIC in the VM is failing to get set up.

I bisected to commit e1d3422c95f0 ("rhashtable: Fix potential deadlock
by moving schedule_work outside lock"), then debugged why opening
the mlx5 NIC device is failing. The failure is in the XDP code in function
__xdp_reg_mem_model() where the call to rhashtable_insert_slow()
is returning -E2BIG. The problem does not occur when the commit
is reverted.

The function call stack is this:

dev_open()
__dev_open()
mlx5e_open()
mlx5e_open_locked()
mlx5e_open_channels()
mlx5e_open_channel()
mlx5e_open_queues()
mlx5e_open_rxq_rq()
mlx5e_open_rq()
mlx5e_alloc_rq()
xdp_rxq_info_reg_mem_model()
__xdp_reg_mem_model()
rhashtable_insert_slow()

I have not debugged further as I don't know anything about the
rhashtable code or the XDP code. The only repro I have is a VM
in Azure. I thought I'd ask you (Breno and Herbert) to review
the patch again and see if there's a path that could cause the
hash table to be incorrectly detected as full.
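
To make the question concrete, here is a userspace toy showing one way
a table could be "incorrectly detected as full". It is not the kernel's
rhashtable code and makes no claim about what the commit actually does.
Every name in it (toy_table, toy_insert, toy_attempts_until_full, the
miscount flag) is made up for illustration. The assumption it models is
simply that an element counter gets incremented on a path where nothing
was really inserted, so the capacity check eventually returns -E2BIG
even though the table is empty.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical toy table: tracks only a believed element count and a limit. */
struct toy_table {
	unsigned int nelems;	/* elements the table believes it holds */
	unsigned int max_elems;	/* inserts at or beyond this fail with -E2BIG */
};

/*
 * One insert attempt.
 * really_added: nonzero if the element was actually stored.
 * miscount:     nonzero models the assumed bug, where the counter is
 *               bumped even though nothing was stored.
 */
static int toy_insert(struct toy_table *t, int really_added, int miscount)
{
	if (t->nelems >= t->max_elems)
		return -E2BIG;

	if (really_added || miscount)
		t->nelems++;

	return really_added ? 0 : -EEXIST;
}

/*
 * Make up to 'attempts' insert attempts that never store anything and
 * return the index of the first attempt that fails with -E2BIG, or -1
 * if none does.
 */
static int toy_attempts_until_full(int miscount, int attempts)
{
	struct toy_table t = { .nelems = 0, .max_elems = 4 };

	for (int i = 0; i < attempts; i++) {
		if (toy_insert(&t, 0, miscount) == -E2BIG)
			return i;
	}
	return -1;
}
```

With correct counting the toy table never reports full, because nothing
was ever stored. With the miscount enabled, the limit is hit once the
inflated counter reaches max_elems. If something similar is happening in
the real code, comparing the table's element count against what was
actually inserted at the time of the failure might show it.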

I've included the linux-hyperv mailing list and the mlx5 driver
maintainers on this email. Someone involved with Azure/Hyper-V
or the mlx5 driver may have seen the problem, and I want to try
to avoid duplicative debugging.

Let me know if there's something I can do to help debug further.

Thanks,

Michael Kelley