On Thu, Aug 17, 2023 at 4:37 AM Shreeya Patel <shreeya.patel@xxxxxxxxxxxxx> wrote:
>
> Hi Greg,
>
> On 16/08/23 20:33, Greg Kroah-Hartman wrote:
> > On Wed, Aug 16, 2023 at 03:09:27PM +0530, Shreeya Patel wrote:
> >> On 13/06/22 15:40, Greg Kroah-Hartman wrote:
> >>> From: Saravana Kannan<saravanak@xxxxxxxxxx>
> >>>
> >>> [ Upstream commit 5ee76c256e928455212ab759c51d198fedbe7523 ]
> >>>
> >>> Mounting NFS rootfs was timing out when deferred_probe_timeout was
> >>> non-zero [1]. This was because ip_auto_config() initcall times out
> >>> waiting for the network interfaces to show up when
> >>> deferred_probe_timeout was non-zero. While ip_auto_config() calls
> >>> wait_for_device_probe() to make sure any currently running deferred
> >>> probe work or asynchronous probe finishes, that wasn't sufficient to
> >>> account for devices being deferred until deferred_probe_timeout.
> >>>
> >>> Commit 35a672363ab3 ("driver core: Ensure wait_for_device_probe() waits
> >>> until the deferred_probe_timeout fires") tried to fix that by making
> >>> sure wait_for_device_probe() waits for deferred_probe_timeout to expire
> >>> before returning.
> >>>
> >>> However, if wait_for_device_probe() is called from the kernel_init()
> >>> context:
> >>>
> >>> - Before deferred_probe_initcall() [2], it causes the boot process to
> >>>   hang due to a deadlock.
> >>>
> >>> - After deferred_probe_initcall() [3], it blocks kernel_init() from
> >>>   continuing till deferred_probe_timeout expires and beats the point of
> >>>   deferred_probe_timeout that's trying to wait for userspace to load
> >>>   modules.
> >>>
> >>> Neither of this is good. So revert the changes to
> >>> wait_for_device_probe().
> >>>
> >>> [1] - https://lore.kernel.org/lkml/TYAPR01MB45443DF63B9EF29054F7C41FD8C60@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
> >>> [2] - https://lore.kernel.org/lkml/YowHNo4sBjr9ijZr@dev-arch.thelio-3990X/
> >>> [3] - https://lore.kernel.org/lkml/Yo3WvGnNk3LvLb7R@xxxxxxxxxxxxx/
> >> Hi Saravana, Greg,
> >>
> >>
> >> KernelCI found this patch causes the baseline.bootrr.deferred-probe-empty test to fail on r8a77960-ulcb,
> >> see the following details for more information.
> >>
> >> KernelCI dashboard link:
> >> https://linux.kernelci.org/test/plan/id/64d2a6be8c1a8435e535b264/
> >>
> >> Error messages from the logs :-
> >>
> >> + UUID=11236495_1.5.2.4.5
> >> + set +x
> >> + export 'PATH=/opt/bootrr/libexec/bootrr/helpers:/lava-11236495/1/../bin:/sbin:/usr/sbin:/bin:/usr/bin'
> >> + cd /opt/bootrr/libexec/bootrr
> >> + sh helpers/bootrr-auto
> >> e6800000.ethernet
> >> e6700000.dma-controller
> >> e7300000.dma-controller
> >> e7310000.dma-controller
> >> ec700000.dma-controller
> >> ec720000.dma-controller
> >> fea20000.vsp
> >> feb00000.display
> >> fea28000.vsp
> >> fea30000.vsp
> >> fe9a0000.vsp
> >> fe9af000.fcp
> >> fea27000.fcp
> >> fea2f000.fcp
> >> fea37000.fcp
> >> sound
> >> ee100000.mmc
> >> ee140000.mmc
> >> ec500000.sound
> >> /lava-11236495/1/../bin/lava-test-case
> >> <8>[ 17.476741] <LAVA_SIGNAL_TESTCASE TEST_CASE_ID=deferred-probe-empty RESULT=fail>
> >>
> >> Test case failing :-
> >> Baseline Bootrr deferred-probe-empty test - https://github.com/kernelci/bootrr/blob/main/helpers/bootrr-generic-tests
> >>
> >> Regression Reproduced :-
> >>
> >> Lava job after reverting the commit 5ee76c256e92
> >> https://lava.collabora.dev/scheduler/job/11292890
> >>
> >>
> >> Bisection report from KernelCI can be found at the bottom of the email.
> >>
> >> Thanks,
> >> Shreeya Patel
> >>
> >> #regzbot introduced: 5ee76c256e92
> >> #regzbot title: KernelCI: Multiple devices deferring on r8a77960-ulcb
> >>
> >> ---------------------------------------------------------------------------------------------------------------------------------------------------
> >>
> >> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * **
> >> * If you do send a fix, please include this trailer: *
> >> * Reported-by: "kernelci.org bot" <bot@...> *
> >> * *
> >> * Hope this helps! *
> >> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
> >>
> >> stable-rc/linux-5.10.y bisection: baseline.bootrr.deferred-probe-empty on
> >> r8a77960-ulcb
> > You are testing 5.10.y, yet the subject says 5.17?
> >
> > Which is it here?
>
> Sorry, I accidentally used the lore link for 5.17 while reporting this
> issue,
> but this test does fail on all the stable releases from 5.10 onwards.
>
> stable 5.15 :-
> https://linux.kernelci.org/test/case/id/64dd156a5ac58d0cf335b1ea/
> mainline :-
> https://linux.kernelci.org/test/case/id/64dc13d55cb51357a135b209/
>

Shreeya, can you try the patch Geert suggested and let us know if it helps?

If not, then I can try to take a closer look.

-Saravana