On Thu, 5 Oct 2006, Andrew Morton wrote: > > 2. The driver uses schedule_work() for handling interrupts, but does not > > make sure any pending work scheduled thus has been completed before > > driver's structures get freed from memory. This is especially > > important as interrupts may keep arriving if the line is shared with > > another PHY. > > > > The solution is to ignore phy_interrupt() calls if the reported device > > has already been halted and calling flush_scheduled_work() from > > phy_stop_interrupts() (but guarded with current_is_keventd() in case > > the function has been called through keventd from the MAC device's > > close call to avoid a deadlock on the netlink lock). > > > > eww, hack. > > Also not module-friendly: > > WARNING: "current_is_keventd" [drivers/net/phy/libphy.ko] undefined! > > Does this > > static void flush_cpu_workqueue(struct cpu_workqueue_struct *cwq) > { > if (cwq->thread == current) { > /* > * Probably keventd trying to flush its own queue. So simply run > * it by hand rather than deadlocking. > */ > run_workqueue(cwq); > > not work??? It does, too well even! -- in the case I am trying to take care of we are run with "rtnl_mutex" held as a result of rtnl_lock() called from unregister_netdev() and there is some work already in the workqueue (linkwatch_event(), apparently) wanting to acquire the same lock. Hence a deadlock. It seems problematic elsewhere as well -- see tg3.c -- but a different way to deal with it has been found there. I am not that fond of this solution, but at least it works, unlike calling flush_scheduled_work() here, which deadlocks the current CPU in my system instantly. Any suggestions as to how to do this differently? Perhaps linkwatch_event() should be scheduled differently (using a separate workqueue?)... Or does dev_close() have to be called under this lock from unregister_netdevice()? Perhaps code like this would do: while (dev->flags & IFF_UP) { rtnl_unlock(); dev_close(dev); rtnl_lock(); } Maciej