On 6/20/2023 1:50 PM, Limonciello, Mario wrote:
I have a suspicion on commit 7bcfdab3f0c6 ("HID: amd_sfh: if no sensors
are enabled, clean up") because the stack trace says that there is a bad
list_add, which could happen if the object is not correctly initialized.
However, that commit was present in v6.2, so it might not be that one.
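For illustration only, here is a minimal userspace sketch of why a list head that was never initialized can trip a list_add sanity check. The names below (init_list_head, list_add_valid, list_add_checked) are hypothetical stand-ins written for this example; they only approximate the kind of neighbour-pointer check the kernel's list debugging performs and are not the kernel's actual code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head (assumption). */
struct list_head {
	struct list_head *next, *prev;
};

/* An empty, correctly initialized list points at itself. */
static void init_list_head(struct list_head *h)
{
	assert(h != NULL);
	h->next = h;
	h->prev = h;
}

/* Simplified consistency check: both neighbours must point at each other. */
static bool list_add_valid(const struct list_head *prev,
			   const struct list_head *next)
{
	return prev && next && next->prev == prev && prev->next == next;
}

/*
 * Insert entry right after head. Returns false instead of touching the
 * list when the neighbour pointers are inconsistent, e.g. because head
 * was zero-filled and never passed through init_list_head().
 */
static bool list_add_checked(struct list_head *entry, struct list_head *head)
{
	struct list_head *next = head->next;

	if (!list_add_valid(head, next))
		return false;

	next->prev = entry;
	entry->next = next;
	entry->prev = head;
	head->next = entry;
	return true;
}
```

With a head set up by init_list_head() the insert succeeds; with a zero-filled head the check fails, which is the situation a "bad list_add" report would point to.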
If I'm not mistaken, the Z13 doesn't actually have any
sensors connected to SFH. So I think the suspicion on
7bcfdab3f0c6 and the theory that this is triggered by HID init
make a lot of sense.
Can you try this patch?
diff --git a/drivers/hid/amd-sfh-hid/amd_sfh_client.c b/drivers/hid/amd-sfh-hid/amd_sfh_client.c
index d9b7b01900b5..fa693a5224c6 100644
--- a/drivers/hid/amd-sfh-hid/amd_sfh_client.c
+++ b/drivers/hid/amd-sfh-hid/amd_sfh_client.c
@@ -324,6 +324,7 @@ int amd_sfh_hid_client_init(struct amd_mp2_dev *privdata)
 			devm_kfree(dev, cl_data->report_descr[i]);
 		}
 		dev_warn(dev, "Failed to discover, sensors not enabled is %d\n", cl_data->is_any_sensor_enabled);
+		cl_data->num_hid_devices = 0;
 		return -EOPNOTSUPP;
 	}
 	schedule_delayed_work(&cl_data->work_buffer, msecs_to_jiffies(AMD_SFH_IDLE_LOOP));
I applied this to 9e87b63ed37e202c77aa17d4112da6ae0c7c097c now, which
was the origin when I started the whole bisection. Clean rebuild, issue
still persists.
Out of 50 boots, I got:
25 clean
22 Oops as posted by the OP
1 same Oops, followed by a panic
1 lockup [1]
1 hanging with just a blank screen
Not sure whether the lockups are related, but [1] mentions modprobe and
udev-worker as well and all problems including the blank screen one
appear roughly at the same time during boot. As this is before a
graphics mode switch, I suspect the last mentioned case may be like [1]
while the screen was blanked.
To support the timing correlation: the UVC error for the IR cam shown in
the photo (normal boot noise) also appears right before the BUG in the
non-lockup bad case.
I do see the dev_warn in dmesg, so the code path modified in your patch
is indeed hit:
[ 10.897521] pcie_mp2_amd 0000:63:00.7: Failed to discover, sensors not enabled is 1
[ 10.897533] pcie_mp2_amd: probe of 0000:63:00.7 failed with error -95
BR Malte
[1] https://photos.app.goo.gl/2FAvQ7DqBsHEF6Bd8
Apologies; for some reason I never got the above reply in my inbox.
Some server along the way might have deemed it spam.
Anyway, I just double checked the Z13 I have on hand. I don't
have the PCI device for SFH (1022:164a) present on the system.
Can you please double check you are on the latest BIOS?
I'm on the latest release from LVFS, 0.1.57 according to fwupdmgr.
Hopefully the newer BIOS fixes it for you, but if it doesn't I did come
up with another patch I've sent out that I guess could be another
solution.
https://lore.kernel.org/linux-input/20230620200117.22261-1-mario.limonciello@xxxxxxx/T/#u