Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
for once, to make this easily accessible to everyone.

I still have this issue on my list of tracked regressions. Was this
fixed in between? Doesn't look like it from here, but I might be missing
something.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

On 23.07.23 16:32, Steven Rostedt wrote:
> On Sun, 23 Jul 2023 20:55:06 +0900
> <kkabe@xxxxxxxxxxx> wrote:
>
>> So I tried to trap NULL and return:
>>
>> ================ patch-drm_vblank_cancel_pending_works-printk-NULL-ret.patch
>> diff -up ./drivers/gpu/drm/drm_vblank_work.c.pk2 ./drivers/gpu/drm/drm_vblank_work.c
>> --- ./drivers/gpu/drm/drm_vblank_work.c.pk2	2023-06-06 20:50:40.000000000 +0900
>> +++ ./drivers/gpu/drm/drm_vblank_work.c	2023-07-23 14:29:56.383093673 +0900
>> @@ -71,6 +71,10 @@ void drm_vblank_cancel_pending_works(str
>>  {
>>  	struct drm_vblank_work *work, *next;
>>
>> +	if (!vblank->dev) {
>> +		printk(KERN_WARNING "%s: vblank->dev == NULL? returning\n", __func__);
>> +		return;
>> +	}
>>  	assert_spin_locked(&vblank->dev->event_lock);
>>
>>  	list_for_each_entry_safe(work, next, &vblank->pending_work, node) {
>> ================
>>
>> This time, the printk trap does not happen!! and radeon.ko works.
>> (NULL check for vblank->worker is still firing though)
>>
>> Now this is puzzling.
>> Is this a timing issue?
>
> It could very well be. And the ftrace patch could possibly not be the
> cause at all. But the thread that is created to do the work is causing
> the race window to be opened up, which is why you see it with the patch
> and don't without it. It may not be the problem, it may just tickle the
> timings enough to trigger the bug, and is causing you to go on a wild
> goose chase in the wrong direction.
>
> -- Steve
>
>
>> Is systemd-udevd doing something not favorable to kernel?
>> Is drm vblank code running without enough initialization?
>>
>> Puzzling is, that purely userland activity
>> (logging in on tty1 before radeon.ko load)
>> is affecting kernel panic/no-panic.
>
>
>