Re: [PATCH 1/3] bus: mhi: host: add mhi_power_down_no_destroy()


On 2/26/2024 8:15 PM, Manivannan Sadhasivam wrote:
On Wed, Feb 21, 2024 at 11:00:24AM +0800, Baochen Qiang wrote:
ath11k fails to resume:

ath11k_pci 0000:06:00.0: timeout while waiting for restart complete

This happens because when calling mhi_sync_power_up() the MHI subsystem
eventually calls device_add() from mhi_create_devices() but the device
creation is deferred:

mhi mhi0_IPCR: Driver qcom_mhi_qrtr force probe deferral

The reason for deferring device creation is explained in dpm_prepare():

          * It is unsafe if probing of devices will happen during suspend or
          * hibernation and system behavior will be unpredictable in this case.
          * So, let's prohibit device's probing here and defer their probes
          * instead. The normal behavior will be restored in dpm_complete().
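
The deferral described in that comment can be pictured with a small user-space model. This is only a sketch under stated assumptions: `struct toy_pm_core` and the `toy_*` functions are hypothetical names invented for illustration, not kernel APIs, and the model only counts devices instead of probing real drivers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of the PM core state; not a kernel structure. */
struct toy_pm_core {
	bool probing_prohibited; /* set between "prepare" and "complete" */
	int deferred;            /* devices added while probing was prohibited */
	int probed;              /* devices whose driver probe has run */
};

/* Models the start of suspend/hibernation: probing is prohibited. */
void toy_dpm_prepare(struct toy_pm_core *pm)
{
	assert(pm != NULL);
	pm->probing_prohibited = true;
}

/*
 * Models adding a device. Returns true if the probe ran immediately,
 * false if it was deferred because a suspend/hibernation is in progress.
 */
bool toy_device_add(struct toy_pm_core *pm)
{
	assert(pm != NULL);
	if (pm->probing_prohibited) {
		pm->deferred++;
		return false;
	}
	pm->probed++;
	return true;
}

/* Models the end of resume: normal behavior is restored, deferred probes run. */
void toy_dpm_complete(struct toy_pm_core *pm)
{
	assert(pm != NULL);
	pm->probing_prohibited = false;
	pm->probed += pm->deferred;
	pm->deferred = 0;
}
```

In this model, a device added between toy_dpm_prepare() and toy_dpm_complete() is not probed until toy_dpm_complete() runs, which is too late for a caller that needs the device working during resume.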

Because the device probe is deferred, the qcom_mhi_qrtr_probe() is not
called and thus MHI channels are not prepared:

This means that QRTR is not delivering messages and the QMI connection between
ath11k and the firmware is not working, resulting in a failure in firmware

To fix this, add a new function mhi_power_down_no_destroy(), which doesn't
destroy the devices for channels during power down. This avoids the probe
deferral issue and, with the following patches, gets ath11k hibernation working.
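
The idea can be sketched in user space as one common power-down helper with a flag, wrapped by two public entry points. This is an illustrative model only: `struct toy_controller`, its fields, and the `toy_*` functions are hypothetical, and the channel device count of 2 is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an MHI controller; not the kernel struct. */
struct toy_controller {
	bool powered;
	int num_channel_devs; /* channel devices currently registered */
	int create_count;     /* how many times channel devices were created */
};

/* Common helper: destroying the channel devices is optional. */
static void toy_power_down_common(struct toy_controller *c, bool destroy_device)
{
	assert(c != NULL);
	c->powered = false;
	if (destroy_device)
		c->num_channel_devs = 0;
}

/* Regular power down: channel devices are destroyed. */
void toy_power_down(struct toy_controller *c)
{
	toy_power_down_common(c, true);
}

/* Suspend/hibernation variant: channel devices are kept. */
void toy_power_down_no_destroy(struct toy_controller *c)
{
	toy_power_down_common(c, false);
}

/* Power up only creates channel devices (needing a probe) if none exist. */
void toy_power_up(struct toy_controller *c)
{
	assert(c != NULL);
	if (c->num_channel_devs == 0) {
		c->num_channel_devs = 2; /* arbitrary number for the sketch */
		c->create_count++;
	}
	c->powered = true;
}
```

In this model, a toy_power_down_no_destroy() followed by toy_power_up() leaves create_count unchanged, so no new device has to be probed while probing is prohibited.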

Up to this line is the actual commit message; the text below should be moved to
the comments section of the patch.

I would like to remove the info below, since it is already included in the cover letter. Even keeping it in the comments section won't make it visible after the patch is merged.

Actually there is an RFC version of this change and it got positive results
from multiple users. At first Mani did not like this idea and insisted that an
MHI device should be destroyed when going to suspend/hibernation, see

Then Mani changed his mind after a further discussion with kernel PM guys,

So we came up with the regular version, which is almost identical to that RFC

Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.30

Signed-off-by: Kalle Valo <quic_kvalo@xxxxxxxxxxx>
Signed-off-by: Baochen Qiang <quic_bqiang@xxxxxxxxxxx>
  drivers/bus/mhi/host/internal.h |  4 +++-
  drivers/bus/mhi/host/pm.c       | 36 +++++++++++++++++++++++++++------
  include/linux/mhi.h             | 15 +++++++++++++-
  3 files changed, 47 insertions(+), 8 deletions(-)

diff --git a/drivers/bus/mhi/host/internal.h b/drivers/bus/mhi/host/internal.h
index 091244cf17c6..8ce4aec56425 100644
--- a/drivers/bus/mhi/host/internal.h
+++ b/drivers/bus/mhi/host/internal.h
@@ -86,6 +86,7 @@ enum dev_st_transition {
@@ -96,7 +97,8 @@ enum dev_st_transition {
  	dev_st_trans(MISSION_MODE,	"MISSION MODE")		\
  	dev_st_trans(FP,		"FLASH PROGRAMMER")	\
  	dev_st_trans(SYS_ERR,		"SYS ERROR")		\
-	dev_st_trans_end(DISABLE,	"DISABLE")
+	dev_st_trans(DISABLE,		"DISABLE")		\
extern const char * const dev_state_tran_str[DEV_ST_TRANSITION_MAX];
  #define TO_DEV_STATE_TRANS_STR(state) (((state) >= DEV_ST_TRANSITION_MAX) ? \
diff --git a/drivers/bus/mhi/host/pm.c b/drivers/bus/mhi/host/pm.c
index 8b40d3f01acc..5686d32f7458 100644
--- a/drivers/bus/mhi/host/pm.c
+++ b/drivers/bus/mhi/host/pm.c
@@ -468,7 +468,8 @@ static int mhi_pm_mission_mode_transition(struct mhi_controller *mhi_cntrl)
/* Handle shutdown transitions */
-static void mhi_pm_disable_transition(struct mhi_controller *mhi_cntrl)
+static void mhi_pm_disable_transition(struct mhi_controller *mhi_cntrl,
+				      bool destroy_device)
  	enum mhi_pm_state cur_state;
  	struct mhi_event *mhi_event;
@@ -530,8 +531,10 @@ static void mhi_pm_disable_transition(struct mhi_controller *mhi_cntrl)
  	dev_dbg(dev, "Waiting for all pending threads to complete\n");
-	dev_dbg(dev, "Reset all active channels and remove MHI devices\n");
-	device_for_each_child(&mhi_cntrl->mhi_dev->dev, NULL, mhi_destroy_device);

It'd be nice to add a comment here on why destroying the device is optional.

+	if (destroy_device) {
+		dev_dbg(dev, "Reset all active channels and remove MHI devices\n");
+		device_for_each_child(&mhi_cntrl->mhi_dev->dev, NULL, mhi_destroy_device);
+	}
  	mutex_lock(&mhi_cntrl->pm_mutex);
@@ -821,7 +824,10 @@ void mhi_pm_st_worker(struct work_struct *work)
-			mhi_pm_disable_transition(mhi_cntrl);
+			mhi_pm_disable_transition(mhi_cntrl, false);
+			break;
+			mhi_pm_disable_transition(mhi_cntrl, true);
@@ -1175,7 +1181,8 @@ int mhi_async_power_up(struct mhi_controller *mhi_cntrl)
-void mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful)
+static void __mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful,
+			     bool destroy_device)
  	enum mhi_pm_state cur_state, transition_state;
  	struct device *dev = &mhi_cntrl->mhi_dev->dev;
@@ -1211,15 +1218,32 @@ void mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful)
-	mhi_queue_state_transition(mhi_cntrl, DEV_ST_TRANSITION_DISABLE);
+	if (destroy_device)
+		mhi_queue_state_transition(mhi_cntrl,
+	else
+		mhi_queue_state_transition(mhi_cntrl,
/* Wait for shutdown to complete */
+void mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful)
+	__mhi_power_down(mhi_cntrl, graceful, true);
+void mhi_power_down_no_destroy(struct mhi_controller *mhi_cntrl,

How about "mhi_power_down_keep_dev"? It is not the best API name, but it
reflects what the API does.

+			       bool graceful)
+	__mhi_power_down(mhi_cntrl, graceful, false);
  int mhi_sync_power_up(struct mhi_controller *mhi_cntrl)
  	int ret = mhi_async_power_up(mhi_cntrl);
diff --git a/include/linux/mhi.h b/include/linux/mhi.h
index 474d32cb0520..39a6a944a52c 100644
--- a/include/linux/mhi.h
+++ b/include/linux/mhi.h
@@ -647,12 +647,25 @@ int mhi_async_power_up(struct mhi_controller *mhi_cntrl);
  int mhi_sync_power_up(struct mhi_controller *mhi_cntrl);
- * mhi_power_down - Start MHI power down sequence
+ * mhi_power_down - Start MHI power down sequence. See also

How about?

	 * mhi_power_down - Power down the MHI device and also destroy the
	 * 		    'struct device' for the channels associated with it.


	 * See also mhi_power_down_keep_dev() which is a variant of
	 * this API that keeps the 'struct device' for channels (useful during
	 * suspend/hibernation).

+ * mhi_power_down_no_destroy() which is a variant of this for
+ * suspend/hibernation.
+ *
   * @mhi_cntrl: MHI controller
   * @graceful: Link is still accessible, so do a graceful shutdown process
  void mhi_power_down(struct mhi_controller *mhi_cntrl, bool graceful);
+ * mhi_power_down_no_destroy - Start MHI power down sequence but don't destroy
+ * struct devices for channels. This is a variant for mhi_power_down() and
+ * would be useful in suspend/hibernation.
+ *

	 * mhi_power_down_keep_dev - Power down the MHI device but keep the
	 * 			     'struct device' for the channels
	 *			     associated with it.


	 * This is a variant of 'mhi_power_down' and useful in scenarios such as
	 * suspend/hibernation where destroying of the 'struct device' is not
	 * needed.

- Mani

+ * @mhi_cntrl: MHI controller
+ * @graceful: Link is still accessible, so do a graceful shutdown process
+ */
+void mhi_power_down_no_destroy(struct mhi_controller *mhi_cntrl, bool graceful);
   * mhi_unprepare_after_power_down - Free any allocated memory after power down
   * @mhi_cntrl: MHI controller
