On Mon, Mar 25, 2019 at 3:14 PM Brian Norris <briannorris@xxxxxxxxxxxx> wrote: > On Mon, Mar 25, 2019 at 2:20 PM Michał Kazior <kazikcz@xxxxxxxxx> wrote: > > For one, you'll need to defer ath10k_pci_fw_crashed_dump to a worker. > > Maybe into ar->restart_work which the dump function calls now. > > Hmm, that's an idea -- although I'm not sure if I'd steal > 'restart_work', or create a different work item on the same queue. But > either way, we'd still also have to avoid holding 'data_lock', and at > that point, I'm not sure if we're losing desirable properties of these > firmware dumps -- it allows more "stuff" to keep going on while we're > preparing to dump the device memory state. So IIUC, we don't today have, for instance, a complete guarantee that Copy Engines are stopped at this point, so these memory dumps are useful mostly by the implied fact that a crashed firmware is no longer doing anything active. So as long as we ensure the driver is retaining exclusion on its own resources (e.g., coredump buffers) while performing the dump work, then we should be OK. > > To get rid of data_lock from ath10k_pci_fw_crashed_dump() you'll need > > to at least make fw_crash_counter into an atomic_t. This is just from > > a quick glance. > > Yes, we'd need at least that much. We'd also need some other form of > locking to ensure exclusion between all users of > ar->coredump.fw_crash_data and similar. At the moment, that's > 'data_lock', but I suppose we get a similar exclusion if all the > dump/restart work is on the same workqueue. I've cooked up a solution with a new dump_{work,mutex} to protect the coredump buffers and do the dumping work. I still keep the 'data_lock' only around the fw_crash_counter. > I'm still not quite sure if this is 5.1-rc material, or I should just > revert for 5.1. I'll post the above soon with a goal of 5.1. If that's not considered good enough, i'll post a revert too. Brian