> From: Leonid Bloch <leonidb@xxxxxxxxxxxxxx>
> Sent: Thursday, June 3, 2021 5:35 AM
> To: KY Srinivasan <kys@xxxxxxxxxxxxx>; Haiyang Zhang
> <haiyangz@xxxxxxxxxxxxx>; Stephen Hemminger
> <sthemmin@xxxxxxxxxxxxx>; Wei Liu <wei.liu@xxxxxxxxxx>; Dexuan Cui
> <decui@xxxxxxxxxxxxx>
> Cc: linux-hyperv@xxxxxxxxxxxxxxx; netdev@xxxxxxxxxxxxxxx
> Subject: [BUG] hv_netvsc: Unbind exits before the VFs bound to it are
> unregistered
>
> Hi,
>
> When I try to unbind a network interface from hv_netvsc and bind it to
> uio_hv_generic, once in a while I get the following kernel panic (please
> note the first two lines: it seems as uio_hv_generic is registered
> before the VF bound to hv_netvsc is unregistered):
>
> [Jun 3 09:04] hv_vmbus: registering driver uio_hv_generic
> [ +0.002215] hv_netvsc 5e089342-8a78-4b76-9729-25c81bd338fc eth2: VF
> unregistering: eth5
> [ +1.088078] BUG: scheduling while atomic: swapper/8/0/0x00010003
> [ +0.000001] BUG: scheduling while atomic: swapper/3/0/0x00010003
> [ +0.000001] BUG: scheduling while atomic: swapper/6/0/0x00010003
> [ +0.000000] BUG: scheduling while atomic: swapper/7/0/0x00010003
> [ +0.000005] Modules linked in:
> [ +0.000001] Modules linked in:
> [ +0.000001] uio_hv_generic
> [ +0.000000] Modules linked in:
> [ +0.000000] Modules linked in:
> [ +0.000001] uio_hv_generic uio
> [ +0.000001] uio
> [ +0.000000] uio_hv_generic
> [ +0.000000] uio_hv_generic
> ...
>
> I run kernel 5.10.27, unmodified, besides RT patch v36, on Azure Stack
> Edge platform, software version 2105 (2.2.1606.3320).
>
> I perform the bind-unbind using the following script (please note the
> comment inline):
>
> net_uuid="f8615163-df3e-46c5-913f-f2d2f965ed0e"
> dev_uuid="$(basename "$(readlink "/sys/class/net/eth1/device")")"
> modprobe uio_hv_generic
> echo "${net_uuid}" > /sys/bus/vmbus/drivers/uio_hv_generic/new_id
> printf "%s" "${dev_uuid}" > /sys/bus/vmbus/drivers/hv_netvsc/unbind
> ### If I insert 'sleep 1' here - all works correctly
> printf "%s" "${dev_uuid}" > /sys/bus/vmbus/drivers/uio_hv_generic/bind
>
>
> Thanks,
> Leonid.

It would be great if you could test the mainline kernel, which I suspect
also has the bug.

It looks like netvsc_remove() -> netvsc_unregister_vf() does the unbinding
work in a synchronous manner, so I don't know why the bug happens.

Right now I don't have a DPDK setup to test this, but I think the bug can
be worked around by unbinding the PCI VF device from the pci-hyperv driver
before unbinding the netvsc device, and re-binding the VF device after
binding the netvsc device to uio_hv_generic.

Thanks,
-- Dexuan
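
For illustration only, the ordering of the workaround suggested above could
be sketched roughly as below. This is untested. It assumes that the
pci-hyperv driver shows up in sysfs under the name "hv_pci", and that
vf_uuid (a hypothetical parameter, not something from the original script)
is the VMBus device ID of the VF's PCI pass-through device, which the
caller has to look up separately. The function name and the SYSFS_ROOT
override are likewise made up for this sketch.

```shell
#!/bin/sh
# Untested sketch of the workaround ordering; not a verified fix.
# ASSUMPTION: pci-hyperv is exposed as the VMBus driver "hv_pci".
# ASSUMPTION: the caller already ran 'modprobe uio_hv_generic'.
# SYSFS_ROOT is a hypothetical override so the steps can be dry-run
# against a scratch directory instead of the real /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# $1: network class GUID (net_uuid in the original script)
# $2: VMBus device ID of the netvsc device (dev_uuid in the original script)
# $3: VMBus device ID of the VF's PCI pass-through device (hypothetical)
rebind_with_vf_detached() {
    net_uuid="$1"
    dev_uuid="$2"
    vf_uuid="$3"
    drivers="${SYSFS_ROOT}/bus/vmbus/drivers"

    # 1. Detach the VF first, so netvsc has no VF left to unregister.
    printf "%s" "${vf_uuid}" > "${drivers}/hv_pci/unbind" || return 1

    # 2. Move the netvsc device over to uio_hv_generic, as in the
    #    original script (no 'sleep 1' between unbind and bind).
    echo "${net_uuid}" > "${drivers}/uio_hv_generic/new_id" || return 1
    printf "%s" "${dev_uuid}" > "${drivers}/hv_netvsc/unbind" || return 1
    printf "%s" "${dev_uuid}" > "${drivers}/uio_hv_generic/bind" || return 1

    # 3. Re-attach the VF only after the synthetic device is on UIO.
    printf "%s" "${vf_uuid}" > "${drivers}/hv_pci/bind" || return 1
}
```

With the variables from the original script, this would be called as
rebind_with_vf_detached "${net_uuid}" "${dev_uuid}" "${vf_uuid}", where
vf_uuid is whatever ID the VF's pass-through device has on the system.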