Search Linux Wireless

Re: [PATCH RFC v2 1/8] bus: mhi: host: add mhi_power_down_no_destroy()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 12/21/2023 12:51 AM, Manivannan Sadhasivam wrote:
On Wed, Dec 20, 2023 at 10:02:25PM +0530, Manivannan Sadhasivam wrote:
On Tue, Dec 05, 2023 at 02:29:33PM +0200, Kalle Valo wrote:
Manivannan Sadhasivam <mani@xxxxxxxxxx> writes:

On Mon, Nov 27, 2023 at 06:20:15PM +0200, Kalle Valo wrote:

From: Baochen Qiang <quic_bqiang@xxxxxxxxxxx>

If ath11k tries to call mhi_power_up() during resume it fails:

ath11k_pci 0000:06:00.0: timeout while waiting for restart complete

This happens because when calling mhi_power_up() the MHI subsystem eventually
calls device_add() from mhi_create_devices() but the device creation is
deferred:

mhi mhi0_IPCR: Driver qcom_mhi_qrtr force probe deferral

The reason for deferring device creation is explained in dpm_prepare():

	/*
	 * It is unsafe if probing of devices will happen during suspend or
	 * hibernation and system behavior will be unpredictable in this case.
	 * So, let's prohibit device's probing here and defer their probes
	 * instead. The normal behavior will be restored in dpm_complete().
	 */

Because the device probe is deferred, the qcom_mhi_qrtr_probe() is not called and
qcom_mhi_qrtr_dl_callback() fails silently as qdev is zero:

static void qcom_mhi_qrtr_dl_callback(struct mhi_device *mhi_dev,
				      struct mhi_result *mhi_res)
{
	struct qrtr_mhi_dev *qdev = dev_get_drvdata(&mhi_dev->dev);
	int rc;

	if (!qdev || mhi_res->transaction_status)
		return;

So what this means that QRTR is not delivering messages and the QMI connection
is not working between ath11k and the firmware, resulting a failure in firmware
initialisation.

To fix this add new function mhi_power_down_no_destroy() which does not destroy
the devices during power down. This way mhi_power_up() can be called during
resume and we can get ath11k hibernation working with the following patches.

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30

Signed-off-by: Baochen Qiang <quic_bqiang@xxxxxxxxxxx>
Signed-off-by: Kalle Valo <quic_kvalo@xxxxxxxxxxx>

Any reason for reposting this series without discussing the suggestion from
Mayank?

Baochen quickly sent me fixes for the v1 review comments, as I have been
out of office for some time I didn't want to sit on Baochen's fixes for
too long. Better to get them out of the door as soon as possible. I will
definitely look at Mayank's proposal but that will take longer.

As I said in the internal thread, this patch breaks the Linux device
driver model by not destroying the "struct device" when the actual
device gets removed.

This patchset has been tested by several people, I'm even using this
patchset on main laptop every day, and we haven't noticed any issues.

Can you elaborate more about this driver model? We are not removing any
ath11k devices, we just want to power down the ath11k (and in the future
ath12k) devices for suspend and power up during resume.


Devices (struct dev) for each channels are created once the device (WLAN) enters
runtime mode such as (MISSION, SBL etc...). During hibernation, ath11k stack
calls mhi_power_down() which essentially resets the device to POR and also the
stack powers down the device properly.

In that case, MHI channels do not exist as the device (WLAN) itself is powered
down. As per kernel driver model, each struct device is tied to its reference
count. And the reference count should be decremented whenever the actual device
is not in use. Once the actual device is removed from the system, then the
respective struct device has to be destroyed altogether.

So in this case, even though the channels are not active (present) in the
device, the device itself gets powered off, you want MHI stack to keep the
struct device active, which is against the model I referenced above.

To fix this issue properly, we need to investigate on how other subsystems are
handling this situation (device getting powered down during hibernation), like
USB.


To me it all sounds like the probe deferral is not handled properly in mac80211
stack. As you mentioned in the commit message that the dpm_prepare() blocks
probing of devices. It gets unblocked and trigerred in dpm_complete():
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/base/power/main.c#n1131

So if mac80211/ath11k cannot probe the devices at the dpm_complete() stage, then
it is definitely an issue that needs to be fixed properly.
To clarify, ath11k CAN probe the devices at dpm_complete() stage. The problem is kernel does not wait for all probes to finish, and in that way we will face the issue that user space applications are likely to fail because they get thawed BEFORE WLAN is ready.


- Mani

- Mani

We should try to explore alternate options instead of persisting with
this solution.

What other options we have here? At least Baochen is not optimistic that
using PM_POST_HIBERNATION as a workaround would work. The issue we have
here is that mhi_power_up() doesn't work in the resume handler and
that's what we should try to fix, not make workarounds.

--
https://patchwork.kernel.org/project/linux-wireless/list/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

--
மணிவண்ணன் சதாசிவம்





[Index of Archives]     [Linux Host AP]     [ATH6KL]     [Linux Wireless Personal Area Network]     [Linux Bluetooth]     [Wireless Regulations]     [Linux Netdev]     [Kernel Newbies]     [Linux Kernel]     [IDE]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite Hiking]     [MIPS Linux]     [ARM Linux]     [Linux RAID]

  Powered by Linux