On Sat, May 16, 2020 at 03:24:01PM +0200, Johannes Berg wrote:
> On Fri, 2020-05-15 at 21:28 +0000, Luis Chamberlain wrote:
> > module_firmware_crashed
>
> You didn't CC me or the wireless list on the rest of the patches, so I'm
> replying to a random one, but ...
>
> What is the point here?
>
> This should in no way affect the integrity of the system/kernel, for
> most devices anyway.

The keyword you used here is "most devices". And in the worst case, *who*
knows what other odd things may happen afterwards.

> So what if ath10k's firmware crashes? If there's a driver bug it will
> not handle it right (and probably crash, WARN_ON, or something else),
> but if the driver is working right then that will not affect the kernel
> at all.

Sometimes the device can go into a state which requires removing and
re-adding the driver to get things back up.

> So maybe I can understand that maybe you want an easy way to discover -
> per device - that the firmware crashed, but that still doesn't warrant a
> complete kernel taint.

That is one reason. Another is that a taint makes it *fast* and easy in
support cases to detect whether the issue was a firmware crash, instead of
scraping logs for driver-specific ways of saying the firmware has crashed.

> Instead of the kernel taint, IMHO you should provide an annotation in
> sysfs (or somewhere else) for the *struct device* that had its firmware
> crash.

It would seem that some folks are thinking of getting more details
through devlink.

> Or maybe, if it's too complex to walk the entire hierarchy
> checking for that, have a uevent, or add the ability for the kernel to
> print out elsewhere in debugfs the list of devices that crashed at some
> point... All of that is fine, but a kernel taint?

debugfs is optional; a taint is simple and device-agnostic. From a support
perspective, a taint makes it very easy to see whether a possible issue may
be specific to device firmware.

  Luis