Re: Kernel segmentation fault unbinding eMMC on AML-S905X-CC

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jan 10, 2019 at 1:59 PM Remi Pommarel <repk@xxxxxxxxxxxx> wrote:
>
> On Wed, Jan 09, 2019 at 10:48:50PM +0000, Elie Roudninski wrote:
> ...
>
> > [  444.060161] Call trace:
> > [  444.062580]  queued_spin_lock_slowpath+0x228/0x2d0
> > [  444.067322]  meson_mmc_irq+0x290/0x2a0
> > [  444.071032]  __free_irq+0x184/0x318
> > [  444.074480]  free_irq+0x40/0x80
> > [  444.077586]  devm_irq_release+0x24/0x30
> > [  444.081382]  release_nodes+0x1e0/0x2e0
> > [  444.085089]  devres_release_all+0x60/0x88
> > [  444.089057]  device_release_driver_internal+0x1c8/0x248
> > [  444.094231]  device_release_driver+0x28/0x38
> > [  444.098459]  unbind_store+0xdc/0x148
> > [  444.101994]  drv_attr_store+0x40/0x58
> > [  444.105627]  sysfs_kf_write+0x5c/0x78
> > [  444.109240]  kernfs_fop_write+0xe8/0x1e0
> > [  444.113126]  __vfs_write+0x60/0x190
> > [  444.116570]  vfs_write+0xac/0x1b0
> > [  444.119848]  ksys_write+0x6c/0xd0
> > [  444.123125]  __arm64_sys_write+0x24/0x30
> > [  444.127015]  el0_svc_common+0x94/0xe8
> > [  444.130629]  el0_svc_handler+0x38/0x80
> > [  444.134340]  el0_svc+0x8/0xc
> > [  444.137187] Code: d37c0401 910020c0 8b010041 f865d8e5 (f8256826)
> > [  444.143223] ---[ end trace 62541ec2ffe020f2 ]---
> > Segmentation fault
>
> ...
>
> Here are my two cents on this.
>
> The irq that is being freed was allocated through device management
> (i.e. with devm_request_threaded_irq()). So this irq will be released
> after meson_mmc_remove() completion, thus after that mmc_free_host()
> has been called. Because this irq has been registered with IRQF_SHARED
> and if you have CONFIG_DEBUG_SHIRQ (do you ?) the irq handler will be
Yes, you are right. I have CONFIG_DEBUG_SHIRQ enabled.
> called while the irq is freed.
>
> So meson_mmc_irq could be called after mmc_free_host() has freed the
> meson_host pointer causing it to dereference an invalid pointer.
>
> So if you have CONFIG_DEBUG_SHIRQ could you please test the following
> patch:
>
> ------------------------------- 8< ----------------------------------
> diff --git a/drivers/mmc/host/meson-gx-mmc.c b/drivers/mmc/host/meson-gx-mmc.c
> index c2690c1a50ff..3a9679531520 100644
> --- a/drivers/mmc/host/meson-gx-mmc.c
> +++ b/drivers/mmc/host/meson-gx-mmc.c
> @@ -179,6 +179,8 @@ struct meson_host {
>         struct sd_emmc_desc *descs;
>         dma_addr_t descs_dma_addr;
>
> +       int irq;
> +
>         bool vqmmc_enabled;
>  };
>
> @@ -1231,7 +1233,7 @@ static int meson_mmc_probe(struct platform_device *pdev)
>         struct resource *res;
>         struct meson_host *host;
>         struct mmc_host *mmc;
> -       int ret, irq;
> +       int ret;
>
>         mmc = mmc_alloc_host(sizeof(struct meson_host), &pdev->dev);
>         if (!mmc)
> @@ -1276,8 +1278,8 @@ static int meson_mmc_probe(struct platform_device *pdev)
>                 goto free_host;
>         }
>
> -       irq = platform_get_irq(pdev, 0);
> -       if (irq <= 0) {
> +       host->irq = platform_get_irq(pdev, 0);
> +       if (host->irq <= 0) {
>                 dev_err(&pdev->dev, "failed to get interrupt resource.\n");
>                 ret = -EINVAL;
>                 goto free_host;
> @@ -1331,9 +1333,8 @@ static int meson_mmc_probe(struct platform_device *pdev)
>         writel(IRQ_CRC_ERR | IRQ_TIMEOUTS | IRQ_END_OF_CHAIN,
>                host->regs + SD_EMMC_IRQ_EN);
>
> -       ret = devm_request_threaded_irq(&pdev->dev, irq, meson_mmc_irq,
> -                                       meson_mmc_irq_thread, IRQF_SHARED,
> -                                       NULL, host);
> +       ret = request_threaded_irq(host->irq, meson_mmc_irq,
> +                       meson_mmc_irq_thread, IRQF_SHARED, NULL, host);
>         if (ret)
>                 goto err_init_clk;
>
> @@ -1387,6 +1388,7 @@ static int meson_mmc_remove(struct platform_device *pdev)
>
>         /* disable interrupts */
>         writel(0, host->regs + SD_EMMC_IRQ_EN);
> +       free_irq(host->irq, host);
>
>         dma_free_coherent(host->dev, SD_EMMC_DESC_BUF_LEN,
>                           host->descs, host->descs_dma_addr);
> ------------------------------- 8< ----------------------------------
>
> This basically freed the irq in the meson_mmc_remove() so the
> meson_mmc_irq() is called before mmc_free_host().
Hello and thank you for looking into this!

Your patch seems to work. I still got a stacktrace but not a segfault
anymore, and it appears to be related to something else:
[  148.133895] WARNING: CPU: 1 PID: 580 at
drivers/mmc/host/meson-gx-mmc.c:1027 meson_mmc_irq+0x210/0x2a0
[  148.137535] Modules linked in: cfg80211 rfkill 8021q garp mrp stp
llc dw_hdmi_cec meson_dw_hdmi dw_hdmi meson_drm dwmac_generic
drm_kms_helper dwmac_meson8b drm crct10dif_ce stmmac_platform stmmac
drm_panel_orientation_quirks meson_ir ptp syscopyarea rc_core
sysfillrect sysimgblt fb_sys_fops pps_core meson_canvas meson_gxbb_wdt
[  148.166345] CPU: 1 PID: 580 Comm: bash Tainted: G        W
4.20.0_2 #1
[  148.173498] Hardware name: amlogic p212/p212, BIOS
2019.01-rc2-00176-gf97c49d6a2 01/10/2019
[  148.181780] pstate: 60400085 (nZCv daIf +PAN -UAO)
[  148.186527] pc : meson_mmc_irq+0x210/0x2a0
[  148.190586] lr : __free_irq+0x184/0x318
[  148.194370] sp : ffff0000108efac0
[  148.197647] x29: ffff0000108efac0 x28: ffff800005cfb600
[  148.202908] x27: 0000000000000000 x26: 0000000000000000
[  148.208169] x25: 0000000000000000 x24: ffff8000105e8228
[  148.213431] x23: 0000000000000013 x22: ffff8000105e8358
[  148.218692] x21: ffff800005e61f00 x20: ffff8000105e8200
[  148.223953] x19: ffff800005e61f00 x18: 0000000000000000
[  148.229214] x17: 0000000000000000 x16: 0000000000000000
[  148.234476] x15: ffffffffffffffff x14: 6d6d2e3030303437
[  148.239737] x13: 3030642f6270612e x12: 0000000000000040
[  148.244998] x11: 0000000000000228 x10: 0000000000000040
[  148.250259] x9 : ffff000009fa9de8 x8 : ffff000009fa9de0
[  148.255521] x7 : ffff800007e7e050 x6 : 0000000000000000
[  148.260782] x5 : ffff800007e7df90 x4 : 0000000000000000
[  148.266043] x3 : 0000000000000000 x2 : ffff000008b90080
[  148.271304] x1 : ffff800005e61f00 x0 : 0000000000000000
[  148.276566] Call trace:
[  148.278984]  meson_mmc_irq+0x210/0x2a0
[  148.282692]  __free_irq+0x184/0x318
[  148.286142]  free_irq+0x40/0x80
[  148.289247]  meson_mmc_remove+0x48/0x190
[  148.293130]  platform_drv_remove+0x30/0x50
[  148.297190]  device_release_driver_internal+0x1b0/0x248
[  148.302357]  device_release_driver+0x28/0x38
[  148.306583]  unbind_store+0xdc/0x148
[  148.310119]  drv_attr_store+0x40/0x58
[  148.313743]  sysfs_kf_write+0x5c/0x78
[  148.317364]  kernfs_fop_write+0xe8/0x1e0
[  148.321247]  __vfs_write+0x60/0x190
[  148.324695]  vfs_write+0xac/0x1b0
[  148.327972]  ksys_write+0x6c/0xd0
[  148.331250]  __arm64_sys_write+0x24/0x30
[  148.335133]  el0_svc_common+0x94/0xe8
[  148.338754]  el0_svc_handler+0x38/0x80
[  148.342463]  el0_svc+0x8/0xc
[  148.345308] ---[ end trace eb849c1a635abc72 ]---
[  148.350408] meson-gx-mmc d0074000.mmc: Dropping the link to regulator.3
[  148.356575] meson-gx-mmc d0074000.mmc: Dropping the link to regulator.1

Moreover when I bind the eMMC after the unbind, Linux detects it
properly, its a huge improvement!
Thank you very much!

I don't know if jbrunet's patch about removing the "non-removable"
property should allow Linux to detects it automatically when plugged
without unbind/bind, but this part is not working.

Regards,

Elie
>
> --
> Rémi Pommarel




[Index of Archives]     [Linux Memonry Technology]     [Linux USB Devel]     [Linux Media]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux