On Fri, Nov 27, 2020 at 8:15 PM Jakub Kicinski <kuba@xxxxxxxxxx> wrote:
>
> On Sat, 28 Nov 2020 03:41:06 +0200 Vladimir Oltean wrote:
> > Jakub, I would like to hear more from you. I would still like to try
> > this patch out. You clearly have a lot more background with the code.
>
> Well, I've seen people run into the problem of this NDO not being able
> to sleep, but I don't have much background or knowledge of what impact
> the locking will have on real systems.
>
> We will need to bring this up with Eric (probably best after the turkey
> weekend is over).
>
> In the meantime if you feel like it you may want to add some tracing /
> printing to check which processes are accessing /proc/net/dev on your
> platforms of interest, see if there is anything surprising.
>
> > You said in an earlier reply that you should have also documented that
> > ndo_get_stats64 is one of the few NDOs that does not take the RTNL. Is
> > there a particular reason for that being so, and a reason why it can't
> > change?
>
> I just meant that as a way of documenting the status quo. I'm not aware
> of any other place reading stats under RCU (which doesn't mean it
> doesn't exist :)).
>
> That said it is a little tempting to add a new per-netdev mutex here,
> instead of congesting RTNL lock further, since today no correct driver
> should depend on the RTNL lock.

Another possible option could be replacing for_each_netdev_rcu with
for_each_netdev_srcu and using list_for_each_entry_srcu (though it's
currently used nowhere else in the kernel). Has anyone considered using
sleepable RCUs or thought of a reason they wouldn't work or wouldn't be
desirable? For more info search for SRCU in Documentation/RCU/RTFP.txt.