On Tuesday, 6 June 2023, 17:25:13 CEST, Limonciello, Mario wrote:
> On 6/6/2023 3:08 AM, Benjamin Tissoires wrote:
> > On Jun 06 2023, Linux regression tracking (Thorsten Leemhuis) wrote:
> >>> On Mon, Jun 05, 2023 at 01:24:25PM +0200, Malte Starostik wrote:
> >>>> Hello,
> >>>>
> >>>> chiming in here as I'm experiencing what looks like the exact same
> >>>> issue, also on a Lenovo Z13 notebook, also on Arch:
> >>>> Oops during startup in task udev-worker followed by udev-worker
> >>>> blocking all attempts to suspend or cleanly shutdown/reboot the
> >>>> machine
> > I have a suspicion on commit 7bcfdab3f0c6 ("HID: amd_sfh: if no sensors
> > are enabled, clean up") because the stack trace says that there is a bad
> > list_add, which could happen if the object is not correctly initialized.
> >
> > However, that commit was present in v6.2, so it might not be that one.
> >
> If I'm not mistaken the Z13 doesn't actually have any
> sensors connected to SFH. So I think the suspicion on
> 7bcfdab3f0c6 and theory this is triggered by HID init makes
> a lot of sense.
>
> Can you try this patch?
>
> diff --git a/drivers/hid/amd-sfh-hid/amd_sfh_client.c
> b/drivers/hid/amd-sfh-hid/amd_sfh_client.c
> index d9b7b01900b5..fa693a5224c6 100644
> --- a/drivers/hid/amd-sfh-hid/amd_sfh_client.c
> +++ b/drivers/hid/amd-sfh-hid/amd_sfh_client.c
> @@ -324,6 +324,7 @@ int amd_sfh_hid_client_init(struct amd_mp2_dev
> *privdata)
> devm_kfree(dev, cl_data->report_descr[i]);
> }
> dev_warn(dev, "Failed to discover, sensors not enabled
> is %d\n", cl_data->is_any_sensor_enabled);
> + cl_data->num_hid_devices = 0;
> return -EOPNOTSUPP;
> }
> schedule_delayed_work(&cl_data->work_buffer,
> msecs_to_jiffies(AMD_SFH_IDLE_LOOP));

I applied this to 9e87b63ed37e202c77aa17d4112da6ae0c7c097c now, which was
the origin when I started the whole bisection. Clean rebuild, issue still
persists.

Out of 50 boots, I got:
25 clean
22 Oops as posted by the OP
 1 same Oops, followed by a panic
 1 lockup [1]
 1 hanging with just a blank screen

Not sure whether the lockups are related, but [1] mentions modprobe and
udev-worker as well and all problems including the blank screen one appear
roughly at the same time during boot. As this is before a graphics mode
switch, I suspect the last mentioned case may be like [1] while the screen
was blanked.
To support the timing correlation: the UVC error for the IR cam shown in
the photo (normal boot noise) also appears right before the BUG in the
non-lockup bad case.

I do see the dev_warn in dmesg, so the code path modified in your patch is
indeed hit:
[ 10.897521] pcie_mp2_amd 0000:63:00.7: Failed to discover, sensors not enabled is 1
[ 10.897533] pcie_mp2_amd: probe of 0000:63:00.7 failed with error -95

BR
Malte

[1] https://photos.app.goo.gl/2FAvQ7DqBsHEF6Bd8