On Fri, Jul 31, 2020 at 02:11:22PM -0700, Rustam Kovhaev wrote: > IB roce driver receives NETDEV_UNREGISTER event, calls dev_hold() and > schedules work item to execute, and before wq gets a chance to complete > it, we return to ip_tunnel.c:274 and call free_netdev(), and then later > we get UAF when scheduled function references already freed net_device > > i added verbose logging to ip_tunnel.c to see pcpu_refcnt: > + pr_info("about to free_netdev(dev) dev->pcpu_refcnt %d", netdev_refcnt_read(dev)); > > and got the following: > [ 410.220127][ T2944] ip_tunnel: about to free_netdev(dev) dev->pcpu_refcnt 8 I think there is a missing call to netdev_wait_allrefs() in the rollback_registered_many(). The normal success flow has this wait after delivering NETDEV_UNREGISTER, the error unwind for register_netdevice should as well. If the netdevice can progress to free while a dev_hold is active I think it means dev_hold is functionally useless. Jason