Re: [PATCH v2 0/9] deferred_probe_timeout logic clean up

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Saravana,

On Wed, Jun 1, 2022 at 12:46 PM Saravana Kannan <saravanak@xxxxxxxxxx> wrote:
> This series is based on linux-next + these 2 small patches applies on top:
> https://lore.kernel.org/lkml/20220526034609.480766-1-saravanak@xxxxxxxxxx/
>
> A lot of the deferred_probe_timeout logic is redundant with
> fw_devlink=on.  Also, enabling deferred_probe_timeout by default breaks
> a few cases.
>
> This series tries to delete the redundant logic, simplify the frameworks
> that use driver_deferred_probe_check_state(), enable
> deferred_probe_timeout=10 by default, and fixes the nfsroot failure
> case.
>
> The overall idea of this series is to replace the global behavior of
> driver_deferred_probe_check_state() where all devices give up waiting on
> supplier at the same time with a more granular behavior:
>
> 1. Devices with all their suppliers successfully probed by late_initcall
>    probe as usual and avoid unnecessary deferred probe attempts.
>
> 2. At or after late_initcall, in cases where boot would break because of
>    fw_devlink=on being strict about the ordering, we
>
>    a. Temporarily relax the enforcement to probe any unprobed devices
>       that can probe successfully in the current state of the system.
>       For example, when we boot with a NFS rootfs and no network device
>       has probed.
>    b. Go back to enforcing the ordering for any devices that haven't
>       probed.
>
> 3. After deferred probe timeout expires, we permanently give up waiting
>    on supplier devices without drivers. At this point, whatever devices
>    can probe without some of their optional suppliers end up probing.
>
> In the case where module support is disabled, it's fairly
> straightforward and all device probes are completed before the initcalls
> are done.
>
> Patches 1 to 3 are fairly straightforward and can probably be applied
> right away.
>
> Patches 4 to 6 are for fixing the NFS rootfs issue and setting the
> default deferred_probe_timeout back to 10 seconds when modules are
> enabled.
>
> Patches 7 to 9 are further clean up of the deferred_probe_timeout logic
> so that no framework has to know/care about deferred_probe_timeout.
>
> Yoshihiro/Geert,
>
> If you can test this patch series and confirm that the NFS root case
> works, I'd really appreciate that.

Thanks, I gave this a try on various boards I have access to.
The results were quite positive. E.g. the compile error I saw on v1
(implicit declation of fw_devlink_unblock_may_probe(), which is no longer
 used in v2) is gone.

However, I'm seeing a weird error when userspace (Debian9 nfsroot) is
starting:

    [  OK  ] Started D-Bus System Message Bus.
    Unable to handle kernel NULL pointer dereference at virtual
address 0000000000000000
    Unable to handle kernel NULL pointer dereference at virtual
address 0000000000000000
    Mem abort info:
      ESR = 0x0000000096000004
    Mem abort info:
      ESR = 0x0000000096000004
      EC = 0x25: DABT (current EL), IL = 32 bits
      SET = 0, FnV = 0
      EC = 0x25: DABT (current EL), IL = 32 bits
      EA = 0, S1PTW = 0
      FSC = 0x04: level 0 translation fault
      SET = 0, FnV = 0
    Data abort info:
      ISV = 0, ISS = 0x00000004
      EA = 0, S1PTW = 0
      FSC = 0x04: level 0 translation fault
      CM = 0, WnR = 0
    user pgtable: 4k pages, 48-bit VAs, pgdp=000000004ec45000
    [0000000000000000] pgd=0000000000000000, p4d=0000000000000000
    Data abort info:
    Internal error: Oops: 96000004 [#1] PREEMPT SMP
    CPU: 0 PID: 374 Comm: v4l_id Tainted: G        W
5.19.0-rc1-arm64-renesas-00799-gc13c3e49e8bd #1660
      ISV = 0, ISS = 0x00000004
    Hardware name: Renesas Ebisu-4D board based on r8a77990 (DT)
    pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      CM = 0, WnR = 0
    pc : subdev_open+0x8c/0x128
    lr : subdev_open+0x78/0x128
    sp : ffff80000aadba60
    x29: ffff80000aadba60 x28: 0000000000000000 x27: ffff80000aadbc58
    x26: 0000000000020000 x25: ffff00000b3aaf00 x24: 0000000000000000
    x23: ffff00000c331c00 x22: ffff000009aa61b8 x21: ffff000009aa6000
    x20: ffff000008bae3e8 x19: ffff00000c3fe200 x18: 0000000000000000
    x17: ffff800076945000 x16: ffff800008004000 x15: 00008cc6bf550c7c
    x14: 000000000000038f x13: 000000000000001a x12: ffff00007fba8618
    x11: 0000000000000001 x10: 0000000000000000 x9 : ffff800009253954
    x8 : ffff00000b3aaf00 x7 : 0000000000000004 x6 : 000000000000001a
    x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000001
    x2 : 0000000100000001 x1 : 0000000000000000 x0 : 0000000000000000
    Call trace:
     subdev_open+0x8c/0x128
     v4l2_open+0xa4/0x120
     chrdev_open+0x78/0x178
     do_dentry_open+0xfc/0x398
     vfs_open+0x28/0x30
     path_openat+0x584/0x9c8
     do_filp_open+0x80/0x108
     do_sys_openat2+0x20c/0x2d8
    user pgtable: 4k pages, 48-bit VAs, pgdp=000000004ec53000
     do_sys_open+0x54/0xa0
     __arm64_sys_openat+0x20/0x28
     invoke_syscall+0x40/0xf8
     el0_svc_common.constprop.0+0xf0/0x110
     do_el0_svc+0x20/0x78
     el0_svc+0x48/0xd0
     el0t_64_sync_handler+0xb0/0xb8
     el0t_64_sync+0x148/0x14c
    Code: f9405280 f9400400 b40000e0 f9400280 (f9400000)
    ---[ end trace 0000000000000000 ]---

This only happens on the Ebisu-4D board (r8a77990-ebisu.dts).
I do not see this on the Salvator-X(S) boards.

Bisection shows this starts to happen with "[PATCH v2 7/9] driver core:
Set fw_devlink.strict=1 by default".

Adding more debug info:

    subdev_open:54: file v4l-subdev1
    Unable to handle kernel NULL pointer dereference at virtual
address 0000000000000000
    subdev_open:54: file v4l-subdev2
    Unable to handle kernel NULL pointer dereference at virtual
address 0000000000000000

Matching the subdev using sysfs gives:

    /sys/devices/platform/soc/e6500000.i2c/i2c-0/0-0070/video4linux/v4l-subdev1
    /sys/devices/platform/soc/e6500000.i2c/i2c-0/0-0070/video4linux/v4l-subdev2

The i2c device is the adi,adv7482 at address 0x70.
But now I'm lost...

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds



[Index of Archives]     [Linux SPI]     [Linux Kernel]     [Linux ARM (vger)]     [Linux ARM MSM]     [Linux Omap]     [Linux Arm]     [Linux Tegra]     [Fedora ARM]     [Linux for Samsung SOC]     [eCos]     [Linux Fastboot]     [Gcc Help]     [Git]     [DCCP]     [IETF Announce]     [Security]     [Linux MIPS]     [Yosemite Campsites]

  Powered by Linux