On 04/07/2024 09:57, Daniel Lezcano wrote:
On 04/07/2024 09:39, neil.armstrong@xxxxxxxxxx wrote:
[ ... ]
OK, I just found out: it's the `qcom-battmgr-bat` thermal zone. In CI we do not have the firmware, so the
temperature is never available; this is why it fails in a loop.
Before this patch it would fail silently, but would be useless if we start the firmware too late.
So since it's firmware based, valid data could arrive very late in the boot stage, and sending an
error message in a loop until the firmware is started doesn't seem right.
Yeah, there was a similar bug with iwlwifi. They fixed it by registering the thermal zone after the firmware was successfully loaded.
Is it possible to do the same?
The thermal zone is indirect, it's registered via power_supply_core.
An attempt was made to delay registering the power supply, since it caused issues in suspend/resume,
but it was reverted because it would require much more work:
https://lore.kernel.org/all/20240123160053.18331-1-johan+linaro@xxxxxxxxxx/
It seems we should return -EAGAIN instead of -ENODEV in qcom_battmgr_bat_get_property().
But I think power_supply_read_temp() should return -EAGAIN on -ENODEV, since it's the return
code for when a power supply isn't initialized.
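
To make the idea concrete, here is a rough userspace sketch of the error mapping, not a patch against the real power_supply_core code. The names fake_psy, fake_get_temp_property() and sketch_read_temp() are hypothetical stand-ins, and the tenths-of-a-degree to millidegree conversion is assumed for illustration:

```c
#include <errno.h>

/* Hypothetical stand-in for a firmware-backed power supply. */
struct fake_psy {
	int firmware_ready;	/* 0 until the firmware has started */
	int temp_tenths_c;	/* temperature in tenths of a degree Celsius */
};

/*
 * Hypothetical property getter: reports -ENODEV while the firmware is
 * not up, mirroring the behaviour described for the battmgr driver.
 */
static int fake_get_temp_property(const struct fake_psy *psy, int *val)
{
	if (!psy->firmware_ready)
		return -ENODEV;

	*val = psy->temp_tenths_c;
	return 0;
}

/*
 * Sketch of the proposed mapping: "supply not initialized yet" (-ENODEV)
 * becomes "try again later" (-EAGAIN), other errors pass through.
 */
static int sketch_read_temp(const struct fake_psy *psy, int *temp_mdegc)
{
	int val, ret;

	ret = fake_get_temp_property(psy, &val);
	if (ret == -ENODEV)
		return -EAGAIN;
	if (ret)
		return ret;

	/* Assumed unit conversion: tenths of a degree to millidegrees. */
	*temp_mdegc = val * 100;
	return 0;
}
```

With this mapping the caller could tell "not ready yet" apart from a real failure and avoid logging an error on every poll, while the output value is left untouched until valid data arrives.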
Neil
I think Rafael's new patch is good, but perhaps it should send an error when it finally stops monitoring.