On 03/08, Stanislav Fomichev wrote:
> On 03/08, Jakub Kicinski wrote:
> > On Sat, 8 Mar 2025 13:18:13 -0800 Jakub Kicinski wrote:
> > > On Sun, 9 Mar 2025 05:37:18 +0900 Kohei Enju wrote:
> > > > Both netdev_lock() and netdev_lock_ops() are called before
> > > > list_netdevice() in register_netdevice().
> > > > No other context can access the struct net_device, so we don't need these
> > > > locks in this context.
> > >
> > > Doesn't sysfs get registered earlier?
> > > I'm afraid not being able to take the lock from the registration
> > > path ties our hands too much. Maybe we need to make a more serious
> > > attempt at letting the caller take the lock?
> >
> > Looking closer at the report - we are violating the contract that only
> > drivers which opted in get their ops called under the instance lock.
> > iavf had a similar problem but it had to opt in. WiFi doesn't.
> >
> > Maybe we can bring the address semaphore back?
> > We just need to take it before the ops lock in do_setlink.
> > A bit ugly but would work?
>
> I remember I was having another lockdep circular report with the addr
> sema, but maybe moving it before the ops lock will fix it, not sure.
>
> But coming back to "No other context can access the struct net_device,
> so we don't need these locks in this context.". What if we move
> netdev_set_addr_lockdep_class() call down a bit? Right before list_netdevice
> happens. Will it help with the lockdep?

Hmm, netdev_set_addr_lockdep_class() does not touch the instance lock :-(
But basically: do lockdep_set_novalidate_class() early and undo it before
list_netdevice()...