Hi Artem,

On Sun, Oct 2, 2022 at 10:23 AM Artem S. Tashkinov <aros@xxxxxxx> wrote:
> On 10/2/22 07:37, Takashi Iwai wrote:
> > On Sat, 01 Oct 2022 12:30:22 +0200,
> > Artem S. Tashkinov wrote:
> >> Here's another one which is outright puzzling:
> >>
> >> You run: dmesg -t --level=emerg,crit,err
> >>
> >> And you see some non-descript errors of some kernel subsystems seemingly
> >> failing or being unhappy about your hardware. Errors are as cryptic as
> >> humanly possible, you don't even know what part of kernel has produced them.
> >>
> >> OK, as a "power" user I download the kernel source, run `grep -R message
> >> /tmp/linux-5.19` and there are _multiple_ different modules and places
> >> which contain this message.
> >>
> >> I'm lost. Send this to LKML? Did that in the long past, no one cared, I
> >> stopped.
> >>
> >> Here's what I'm getting with Linux 5.19.12:
> >>
> >> platform wdat_wdt: failed to claim resource 5: [mem 0x00000000-0xffffffff7fffffff]
> >> ACPI: watchdog: Device creation failed: -16
> >> ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.XHC.RHUB.TPLD], AE_NOT_FOUND (20220331/psargs-330)
> >> ACPI Error: Aborting method \_SB.UBTC.CR01._PLD due to previous error (AE_NOT_FOUND) (20220331/psparse-529)
> >> platform MSFT0101:00: failed to claim resource 1: [mem 0xfed40000-0xfed40fff]
> >> acpi MSFT0101:00: platform device creation failed: -16
> >> lis3lv02d: unknown sensor type 0x0
> >>
> >> Are they serious? Should they be reported or not? Is my laptop properly
> >> working? I have no clue at all.
> >
> > That's a dilemma. The kernel can't know whether it's "properly"
> > working, either -- that is, whether the lack of some functions matters
> > for you or not. In your case above, it's about a watchdog, something
> > related with USB, TPM, and acceleration sensor, all of which likely
> > come from a buggy BIOS. Would you mind if those features are missing?
> > Or even whether your device has a correct hardware implementation?
> > Kernel doesn't know, hence it complains as an error.
> >
> > In many drivers, there are mechanisms to shut off superfluous error
> > messages for known devices. So it's case-by-case solutions.
> >
> > Or you can completely hide those errors at boot by a boot option
> > (e.g. loglevel=2).
>
> The problem is some of such messages are indeed indicative of certain
> real issues which result in HW not working properly, including:
>
> 1) missing/incorrect firmware
> 2) most importantly: not enabled power saving modes
> 3) not enabled high performance modes
> 4) not enabled devices
> 5) not enabled devices' functions
> 6) drivers conflicts (i.e. the wrong module gets loaded for the device)
> 7) physically failing hardware
>
> I'm quite sure you don't really know what half of those messages
> actually mean.
>
> Speaking of 7. Various kernel subsystems/drivers deal with e.g. mass
> storage which is known to fail quite often. There's not a single driver
> in the kernel which is actually brave enough to spew something like this:
>
> "/dev/xxxx might be failing, please RMA or seek help online"
>
> instead you get a dmesg choke full of "unable to read sector XXX" or
> something like that.
>
> To return to the previous errors: it's impossible for the user to assess
> their severity and that sucks. What is "platform device creation
> failed"? What is "unknown sensor type"? What am I missing? Who's
> responsible? The kernel? My HW vendor? Are those errors actionable? In
> my understanding a properly working computer must not produce
> "emerg,crit,err" errors. I'm not even talking about "warn,info" and such.

I am afraid that for most of the above, the kernel cannot know the answer.
Hence more investigation/debugging is needed.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker.
But when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds