Re: [PATCH] mmc: dw_mmc-exynos: fix potential external abort in resume_noirq()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Marek,

On 12 June 2018 at 10:28, Marek Szyprowski <m.szyprowski@xxxxxxxxxxx> wrote:
> Hi Ulf,
>
> On 2018-06-11 14:24, Ulf Hansson wrote:
>> On 11 June 2018 at 11:50, Marek Szyprowski <m.szyprowski@xxxxxxxxxxx> wrote:
>>> On 2018-06-11 11:35, Ulf Hansson wrote:
>>>> On 11 June 2018 at 08:48, Marek Szyprowski <m.szyprowski@xxxxxxxxxxx> wrote:
>>>>> dw_mci_exynos_resume_noirq() performs DWMMC register access without
>>>>> ensuring that respective clocks are enabled. This might cause external
>>>>> abort on some systems (observed on Exynos5433 based boards). Fix this
>>>>> by adding needed prepare_enable/disable_unprepare calls.
>>>>>
>>>>> Signed-off-by: Marek Szyprowski <m.szyprowski@xxxxxxxxxxx>
>>>>> ---
>>>>>    drivers/mmc/host/dw_mmc-exynos.c | 6 ++++++
>>>>>    1 file changed, 6 insertions(+)
>>>>>
>>>>> diff --git a/drivers/mmc/host/dw_mmc-exynos.c b/drivers/mmc/host/dw_mmc-exynos.c
>>>>> index 3164681108ae..6125b68726b0 100644
>>>>> --- a/drivers/mmc/host/dw_mmc-exynos.c
>>>>> +++ b/drivers/mmc/host/dw_mmc-exynos.c
>>>>> @@ -193,6 +193,9 @@ static int dw_mci_exynos_resume_noirq(struct device *dev)
>>>>>           struct dw_mci_exynos_priv_data *priv = host->priv;
>>>>>           u32 clksel;
>>>>>
>>>>> +       clk_prepare_enable(host->biu_clk);
>>>>> +       clk_prepare_enable(host->ciu_clk);
>>>>> +
>>>>>           if (priv->ctrl_type == DW_MCI_TYPE_EXYNOS7 ||
>>>>>                   priv->ctrl_type == DW_MCI_TYPE_EXYNOS7_SMU)
>>>>>                   clksel = mci_readl(host, CLKSEL64);
>>>>> @@ -207,6 +210,9 @@ static int dw_mci_exynos_resume_noirq(struct device *dev)
>>>>>                           mci_writel(host, CLKSEL, clksel);
>>>>>           }
>>>>>
>>>>> +       clk_disable_unprepare(host->biu_clk);
>>>>> +       clk_disable_unprepare(host->ciu_clk);
>>>>> +
>>>>>           return 0;
>>>>>    }
>>>>>    #else
>>>> I looked a little closer and I am wondering if it wouldn't be possible
>>>> to use SET_NOIRQ_SYSTEM_SLEEP_PM_OPS() instead of
>>>> SET_SYSTEM_SLEEP_PM_OPS()?
>>>>
>>>> Somelike this:
>>>> SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
>>>> dw_mci_exynos_resume_noirq)
>>>>
>>>> Then from dw_mci_exynos_resume_noirq() call pm_runtime_force_resume().
>>>>
>>>> I think it would simplify the code a bit, as you can rely on the
>>>> runtime PM callbacks to deal with clk_prepare_enable() and
>>>> clk_disable_unprepare(), unless I am mistaken.
>>> This will not fix the problem, because mci_writel() calls in
>>> dw_mci_exynos_resume_noirq are done unconditionally, regardless of the
>>> controller's runtime pm state. Since commit 1d9174fbc55e after calling
>>> pm_runtime_force_resume() there is no guarantee that device is in
>>> runtime active state if it was runtime suspended state.
>> Yes, because the runtime PM usage count is greater than 1.
>> (pm_runtime_get_noresume() is called during probe).
>>
>> If you want to make this explicit (not relying on ->probe()), one can
>> add a ->suspend_noirq() callback and call pm_runtime_get_noresume() in
>> it.
>
> Sorry, but I don't get how this would work. Exactly the same pattern as
> you have proposed was already used in s3c-64xx SPI driver and it didn't
> work properly (tested on the same SoC as this DW-MMC change). I had to
> move register access to runtime resume callback to fix external abort
> issue:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e935dba111621bd6a0c5d48e6511a4d9885103b4

Yep, that is a correct solution.

>
> Here in DW-MMC driver such approach (moving all the code to runtime
> resume callback) is not possible because of the potential interrupt storm
> caused by the hw bug (that's the reason of using noirq resume callback).

I understand. What you need is to run the runtime resume/suspend
callbacks in the resume/suspend noirq phase. Moreover, you need to
make sure that the runtime resume callback, really becomes invoked
during the resume noirq phase, because of the HW bug.

I think the below should work. Can you give it a try?

It relies on the call pm_runtime_get_noresume(), done during
->probe(). Note that, the driver always keeps the RPM usage count
increased, thus preventing runtime suspend during normal execution.

Anyway, if this doesn't work, your suggested approach works fine as well.

---
 drivers/mmc/host/dw_mmc-exynos.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/mmc/host/dw_mmc-exynos.c b/drivers/mmc/host/dw_mmc-exynos.c
index a84aa3f1ae85..66132f7fceed 100644
--- a/drivers/mmc/host/dw_mmc-exynos.c
+++ b/drivers/mmc/host/dw_mmc-exynos.c
@@ -175,7 +175,9 @@ static int dw_mci_exynos_runtime_resume(struct device *dev)

        return ret;
 }
+#endif /* CONFIG_PM */

+#ifdef CONFIG_PM_SLEEP
 /**
  * dw_mci_exynos_resume_noirq - Exynos-specific resume code
  *
@@ -193,6 +195,8 @@ static int dw_mci_exynos_resume_noirq(struct device *dev)
        struct dw_mci_exynos_priv_data *priv = host->priv;
        u32 clksel;

+       pm_runtime_force_resume(dev);
+
        if (priv->ctrl_type == DW_MCI_TYPE_EXYNOS7 ||
                priv->ctrl_type == DW_MCI_TYPE_EXYNOS7_SMU)
                clksel = mci_readl(host, CLKSEL64);
@@ -209,9 +213,7 @@ static int dw_mci_exynos_resume_noirq(struct device *dev)

        return 0;
 }
-#else
-#define dw_mci_exynos_resume_noirq     NULL
-#endif /* CONFIG_PM */
+#endif /* CONFIG_PM_SLEEP */

 static void dw_mci_exynos_config_hs400(struct dw_mci *host, u32 timing)
 {
@@ -553,14 +555,11 @@ static int dw_mci_exynos_remove(struct
platform_device *pdev)
 }
 static const struct dev_pm_ops dw_mci_exynos_pmops = {
-       SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
-                               pm_runtime_force_resume)
+       SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
+                               dw_mci_exynos_resume_noirq)
        SET_RUNTIME_PM_OPS(dw_mci_runtime_suspend,
                           dw_mci_exynos_runtime_resume,
                           NULL)
-       .resume_noirq = dw_mci_exynos_resume_noirq,
-       .thaw_noirq = dw_mci_exynos_resume_noirq,
-       .restore_noirq = dw_mci_exynos_resume_noirq,
 };

 static struct platform_driver dw_mci_exynos_pltfm_driver = {
-- 

Kind regards
Uffe
--
To unsubscribe from this list: send the line "unsubscribe linux-samsung-soc" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux SoC Development]     [Linux Rockchip Development]     [Linux USB Development]     [Video for Linux]     [Linux Audio Users]     [Linux SCSI]     [Yosemite News]

  Powered by Linux