On 05/25/2018 09:25 AM, Vladimir Zapolskiy wrote: >>>> For ages trivial changes to RAVB and SuperH ethernet links by means of >>>> standard 'ethtool' trigger a 'sleeping function called from invalid >>>> context' bug, to visualize it on r8a7795 ULCB: >>>> >>>> % ethtool -r eth0 >>>> BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747 >>>> in_atomic(): 1, irqs_disabled(): 128, pid: 554, name: ethtool >>>> INFO: lockdep is turned off. >>>> irq event stamp: 0 >>>> hardirqs last enabled at (0): [<0000000000000000>] (null) >>>> hardirqs last disabled at (0): [<ffff0000080e1d3c>] copy_process.isra.7.part.8+0x2cc/0x1918 >>>> softirqs last enabled at (0): [<ffff0000080e1d3c>] copy_process.isra.7.part.8+0x2cc/0x1918 >>>> softirqs last disabled at (0): [<0000000000000000>] (null) >>>> CPU: 5 PID: 554 Comm: ethtool Not tainted 4.17.0-rc4-arm64-renesas+ #33 >>>> Hardware name: Renesas H3ULCB board based on r8a7795 ES2.0+ (DT) >>>> Call trace: >>>> dump_backtrace+0x0/0x198 >>>> show_stack+0x24/0x30 >>>> dump_stack+0xb8/0xf4 >>>> ___might_sleep+0x1c8/0x1f8 >>>> __might_sleep+0x58/0x90 >>>> __mutex_lock+0x50/0x890 >>>> mutex_lock_nested+0x3c/0x50 >>>> phy_start_aneg_priv+0x38/0x180 >>>> phy_start_aneg+0x24/0x30 >>>> ravb_nway_reset+0x3c/0x68 >>>> dev_ethtool+0x3dc/0x2338 >>>> dev_ioctl+0x19c/0x490 >>>> sock_do_ioctl+0xe0/0x238 >>>> sock_ioctl+0x254/0x460 >>>> do_vfs_ioctl+0xb0/0x918 >>>> ksys_ioctl+0x50/0x80 >>>> sys_ioctl+0x34/0x48 >>>> __sys_trace_return+0x0/0x4 >>>> >>>> The root cause is that an attempt to modify ECMR and GECMR registers >>>> only when RX/TX function is disabled was too overcomplicated in its >>>> original implementation, also processing of an optional Link Change >>>> interrupt added even more complexity, as a result the implementation >>>> was error prone. >>>> >>>> The new locking scheme is confirmed to be correct by dumping driver >>>> specific and generic PHY framework function calls with aid of ftrace >>>> while running more or less advanced tests. >>>> >>>> Please note that sh_eth patches from the series were built-tested only. >>>> >>>> On purpose I do not add Fixes tags, the reused PHY handlers were added >>>> way later than the fixed problems were firstly found in the drivers. >>> >>> I think you went one step too far with these fixes. On the first glance, >>> the real fixes are to remove grabbing/releasing the spinlock for the duration >>> of the phylib calls. Am I right? If so, making use of the new phylib APIs >>> would be a further enhancement, it's not needed for fixing the splats per se... >> >> Note that I hadn't looked at the patches #3/#6 at the time of writing this; >> those seem to be more complicated than the rest. > > Right, the simplistic approach of just removing the held spinlock does > not fit well into the overall lame locking model found in the driver. Yet you only try fixing it in the patches #3 and #6. I was talking about the patches #1 and #4 mostly (#2 and #5 turned out to be non-fixes). > The thing is that I would prefer to exhibit 'remove custom callbacks' > side of the changes as it is done now, and fixing severe 'invalid contex' > bugs is left as a valuable side effect. I may attempt to find enough > free time to follow your instructions, but frankly speaking I don't > see it beneficial to split a single good all-sufficient change into > three or more: removal of spinlocks, replacement of phy_start_aneg(), > then a non-functional clean-up. Yes, I would prefer these step-by-step changes. > Bikeshedding isn't my preference, This is not about bikeshedding. What you are trying to do clearly violates the 2 basic principles of the kernel development: "don't mix fixes and enhancements" and "do one thing per patch". > but a report about technical flaws related to the published changes > is appreciated, otherwise let me ask you to accept the changes as is, > secondary optimizations can be done on top of them. No, I'll certainly have to NAK patches #1/#3 in their current form. I'm yet to review patches #3/#6... anyway, if you lack the time to do things properly, I'll have to take this burden on my shoulders (giving you credits). Yet I'm basically is in the same situation as you -- I have to spend my copiuos free time on the large patch sets (like yours) and I'm still having some cleanups to sh_eth cooking here (which I'll most probably have to defer)... > -- > With best wishes, > Vladimir MBR, Sergei