On Fri, 2020-11-20 at 12:53 -0800, Brian Norris wrote: > (Sorry if anything's a bit slow here. I don't really have time to > write out full proposals myself.) Don’t worry, I (and other owners) am already glad to hear from upstream devs, including you :) (and I'm also sorry for the late reply from me. It was difficult to find spare time these days.) > On Fri, Oct 30, 2020 at 3:30 AM Tsuchiya Yuto <kitakar@xxxxxxxxx> wrote: > > Let me know if splitting this patch like this works. 1) The first patch > > is to add this module parameter but don't change the default behavior. > > That *could* be OK with me, although it's already been said that there > are many people who dislike extra module parameters. I also don't see > why this needs to be a module parameter at all. If you do #2 right, > you don't really need this, as there are already several standard ways > of doing this (e.g., via Kconfig, or via nl80211 on a per-device > basis). > > > 2) The second patch is to change the parameter value depending on the > > DMI matching or something so that it doesn't break the existing users. > > Point 2 sounds good, and this is the key point. Note that you can do > point 2 without making it a module parameter. Just keep a flag in the > driver-private structures. I chose to also provide the module parameter because I thought it would be a little bit complicated for users to re-enable this dump feature if I chose not to provide this parameter. If I don't provide the parameter but still want to allow users to re-enable this dump feature, we can provide a way to change the bit flags of quirks (via a new debugfs entry or another module parameter). That said, is there a way to easily change the quirk flags only for this dump feature? For example, assume that the following three quirks are enabled by default: /* quirks */ #define QUIRK_FW_RST_D3COLD BIT(0) #define QUIRK_NO_DEVICE_DUMP BIT(1) #define QUIRK_FOO BIT(2) .driver_data = (void *)(QUIRK_FW_RST_D3COLD | QUIRK_NO_DEVICE_DUMP | QUIRK_FOO), card->quirks = (uintptr_t)dmi_id->driver_data; /* and assume that card->quirks can be changed by users by a file named * "quirks" under debugfs. */ So, the card->quirks will be "7" in decimal by default. Then, if users want to re-enable the dump feature, as far as I know, users need to know the default value "7", then need to know that device_dump is controlled by bit 1. Finally, users can set the new quirk flags "5" in decimal (101 in binary). echo 5 > /sys/kernel/debug/mwifiex/mlan0/quirks I'm glad if there is another nice way to control only the device_dump quirk flag, if we only provide a way to change quirk flags. But at the same time I also think that if someone dare to want to re-enable this feature, maybe the person won't feel it's complicated haha. So, I'll drop the device_dump module parameter and switch to use the quirk framework, adding a way to change the quirk flags value by users. That said, we may even drop disabling the dump. Take a look at my comment I wrote below. > > But what I want to say here as well is that, if the firmware can be fixed, > > we don't need a patch like this. > > Sure. That's also where we don't necessarily need more ways to control > this from user space (e.g., module parameters), but just better > detection of currently broken systems (in the driver). And if firmware > ever gets fixed, we can undo the "broken device" detection. There are two types of firmware crashes we observes: 1) cmd_timedout (other than ps_mode-related) 2) Firmware wakeup failed (ps_mode-related) The #2 is observed when we enabled ps_mode. The #1 is observed for the other causes. And hopefully, verdre (in Cc) found a "fix" [1] for the #1 fw crash. We are trying the fix now. The pull req (still WIP) basically addresses fw crashing by using "non-posted" register writes and uninterruptible wait queue for PCI operations in the kernel driver side. We still don't have any ideas to "fix" the #2 fw crash, but now we don't see the #1 crash anymore so far. I'd like to see where the attempt goes before I start working on this patch here again. Thank you, everyone. [1] https://github.com/linux-surface/kernel/pull/70 [WIP] Properly fix wifi firmware crashes by jonas2515 · Pull Request #70 · linux-surface/kernel > Brian