On Mon, Jan 29, 2024 at 1:31 PM Manivannan Sadhasivam <mani@xxxxxxxxxx> wrote:
>
> On Mon, Jan 29, 2024 at 01:22:27PM +0100, Rafael J. Wysocki wrote:
> > On Mon, Jan 29, 2024 at 11:10 AM Baochen Qiang <quic_bqiang@xxxxxxxxxxx> wrote:
> > >
> > > Hi Rafael and Pavel,
> > >
> > > Currently I am facing an ath11k (a kernel WLAN driver) resume issue
> > > related to the kernel PM framework and the MHI module.
> > >
> > > Before introducing the issue details, I'd like to summarize how ath11k
> > > interacts with the MHI stack to download WLAN firmware to the hardware target:
> > > 1. When booting/restarting, ath11k powers on the MHI module and waits for
> > > MHI channels to be ready.
> > > 2. On power-on, the MHI stack creates some virtual MHI devices, which
> > > represent MHI hardware channels, and adds them to the MHI bus. This
> > > triggers the MHI client driver, named QRTR, to get matched and probe those
> > > MHI devices. In probe, QRTR initializes MHI channels and finally moves
> > > them to the ready state.
> > > 3. Once MHI channels are ready, ath11k downloads WLAN firmware to the hardware
> > > target, then WLAN is working.
> > >
> > > Such a flow works well in general, but introduces issues in the hibernation
> > > cycle: when preparing for hibernation, ath11k powers down MHI, which
> > > results in MHI devices being destroyed, thus QRTR resets MHI channels.
> > > When resuming from hibernation, ath11k powers on MHI and waits for
> > > MHI channels to be ready in its resume callback. As said above, MHI
> > > creates and adds MHI devices to the MHI bus, but they can't be probed at
> > > that time because device probe is prohibited by device_block_probing();
> > > finally this results in an ath11k resume timeout.
> > >
> > > Now there is a potential fix for this issue which would need changes in
> > > the MHI stack, i.e., don't destroy MHI devices while hibernating.
> >
> > Exactly.
> >
>
> During hibernation, the power to ath11k could be lost and in that case, there
> will be no channels available from the device. So keeping the "struct dev" when
> there is no real device attached to the system goes against the driver model
> IMO since we would be messing with the refcount.

But this is system hibernation or suspend and the reason for the power
loss is quite different from device removal at run time.  The device
is going to be back during resume (or at least it is not expected to
go away in the meantime), so it is pointless to destroy its
representation in memory.

> For instance in the case of USB, if the device gets unplugged, would it make
> sense to keep the "struct dev" for the device in kernel in the hope that it would
> come back again?

At run time - no, during system suspend - yes.

It is not even recommended to free IRQs during system suspend.

> The driver model as I understood is, once the actual physical device gets
> removed, the refcount for "struct dev" should be decremented and it should be
> destroyed.

Not really.

Thanks!
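
As a rough illustration of the ordering problem described in this thread,
here is a toy user-space model in Python. It is not kernel code: ToyBus,
hibernate_and_resume() and the channel name are hypothetical names made up
for the sketch. It only assumes what the thread states, namely that devices
added while probing is blocked are not probed until later, and that the
resume callback waits for the channel in the meantime.

```python
from dataclasses import dataclass, field


@dataclass
class ToyBus:
    """Toy stand-in for a bus whose probing can be blocked (hypothetical)."""
    probing_blocked: bool = False
    deferred: list = field(default_factory=list)  # devices still waiting for a probe
    ready: set = field(default_factory=set)       # channels whose client driver has probed

    def add_device(self, name):
        # While probing is blocked, a new device is only queued, never bound.
        if self.probing_blocked:
            self.deferred.append(name)
        else:
            self.ready.add(name)

    def remove_device(self, name):
        self.ready.discard(name)
        if name in self.deferred:
            self.deferred.remove(name)

    def unblock_probing(self):
        # Deferred probes run only once probing is allowed again.
        self.probing_blocked = False
        self.ready.update(self.deferred)
        self.deferred.clear()


def hibernate_and_resume(bus, channel, destroy_on_suspend, max_polls=3):
    """Return True if `channel` is ready before the resume callback gives up."""
    bus.add_device(channel)          # normal boot: the probe succeeds
    bus.probing_blocked = True       # probing is blocked for the whole transition
    if destroy_on_suspend:
        bus.remove_device(channel)   # device destroyed while suspending
        bus.add_device(channel)      # re-created in resume: the probe is deferred
    # The resume callback polls for readiness while probing is still blocked.
    became_ready = any(channel in bus.ready for _ in range(max_polls))
    bus.unblock_probing()            # comes too late for the waiting callback
    return became_ready


if __name__ == "__main__":
    print(hibernate_and_resume(ToyBus(), "qrtr-chan", destroy_on_suspend=True))
    print(hibernate_and_resume(ToyBus(), "qrtr-chan", destroy_on_suspend=False))
```

In this model the first call gives up without the channel ever becoming
ready, while the second call, which keeps the device across the transition,
sees it ready on the first poll.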