On Thu, Nov 02, 2023 at 02:11:49AM +0100, Armin Wolf wrote:
> On 01.11.23 at 05:34, James Seo wrote:
>
>> On Tue, Oct 31, 2023 at 11:34:16PM +0100, Armin Wolf wrote:
>>> On 31.10.23 at 22:07, Lukasz Stelmach wrote:
>>>
>>>> It was <2023-10-31 Tue 12:28>, when Guenter Roeck wrote:
>>>>> On 10/31/23 12:07, Lukasz Stelmach wrote:
>>>>>
>>>>> [ ... ]
>>>>>
>>>>>>> For what it's worth, I personally don't see much value in doing much
>>>>>>> more than a machine-limited workaround for now. To me it's clear that
>>>>>>> this UTF-16 corner case is a BIOS bug and its consequences are minimal
>>>>>>> once a workaround is in place.
>>>>>>>
>>>>>>> Thoughts?
>>> I think this is no BIOS bug, but valid behavior since the Windows ACPI-WMI mapper
>>> converts the ACPI objects into a common buffer format as described here:
>>>
>>> https://learn.microsoft.com/en-us/windows-hardware/drivers/kernel/driver-defined-wmi-data-items
>>>
>> Hi Armin,
>>
>> I did see this link when you mentioned it earlier, and I understand that it's
>> specifying the packed and internally aligned buffer format that WMI on Windows
>> expects when a Windows driver provides a WMI data block.
>>
>> This to me is a different question from whether an ACPI object in the BIOS,
>> one which will be converted to a WMI object by Windows later, should contain
>> UTF-16. I didn't find a single other example in all the ACPI dumps in the Linux
>> Hardware Database [1] of such an ACPI object.
>>
>> So the answer to the question seems like a "SHOULD NOT". And someone at HP
>> definitely did a bad copy-paste when it came to this BIOS. I feel comfortable
>> calling it a bug (the leading "4" makes it one in any case).
>
> Not exactly, the leading "4" is a hack:
>
> The string "4BIOS Configuration Change" gets converted to "\u0034 \u0042 \u0049 \u004f \u0053 \u0020 \u0043 \u006f \u006e \u0066 \u0069 \u0067
> \u0075 \u0072 \u0061 \u0074 \u0069 \u006f \u006e \u0020 \u0043 \u0068 \u0061 \u006e \u0067 \u0065 \u0000",
> so without the leading "4", the utf16 string is 52 bytes long.
>
> Since WMI wants the utf16 string prefixed with its length (52 = 0x34), the leading "4" was added
> since the letter "4" gets converted to \u0034 (0x34 = 52).
>
> So for WMI, the leading "4" gets interpreted as the length of the following utf16 string,
> so it is not displayed to WMI data consumers.

I see. So the ACPI-WMI mapper doesn't really "handle" UTF-16, then. Not in the
same way it does UTF-8. I can't remember where, but I know I've dealt with
these length-prefixed UTF-16 strings before when touching other Microsoft
offerings.

So maybe this string was originally inserted as
'Buffer (0x00,0x34,0x00,0x42,...0x00,0x00)' and 'Unicode ("4BIOS...")' [1] is
just how iasl decompiles it.

Either way, someone at HP realized all of this, and instead of just changing
this one string back to a regular string like every other similar string in
the same BIOS - and probably every other BIOS they'd ever seen - kept it.
On purpose. Sigh.

>> And now I'm thinking out loud, but if WMI doesn't allow arbitrary binary data
>> (and from the WMI buffer spec you linked, it doesn't), and the Windows ACPI-WMI
>> mapper can indeed handle UTF-16, then ACPI_TYPE_BUFFER in ACPI objects intended
>> to become WMI objects can only contain UTF-16.
>
> On my machine (Dell Inspiron 3505), an ACPI WMI method uses an ACPI_TYPE_BUFFER
> to pass an array of u32, so assuming ACPI_TYPE_BUFFER == utf16 is false.
>
> This shows why linux should do this preprocessing too, because WMI marshals
> the _converted_ buffer using the MOF definitions, not the ACPI object.
>
> According to Microsoft documentation, the ACPI-WMI mapper converts the ACPI object
> without having _any_ knowledge of the corresponding MOF definitions. This means
> that after conversion, the contents of the WMI buffer are only "normalized" so
> the WMI core does not have to know about ACPI data types.

Interesting. These HP machines represent a WMI array of strings as individual
ACPI String elements in an ACPI Package, so I assumed arrays of other types
would be individual elements as well.

> If linux would also do this "normalization" step, then we would also be able to
> operate on this "normalized" WMI buffer without relying on the ACPI data types,
> which (like in this case) can use different ways to express the same WMI data
> (string as ACPI string vs string as utf16 buffer).

Thanks for your work on the WMI subsystem side. Looking forward to when this,
BMOF decoding, and an unmarshaling mechanism are in the kernel and drivers can
work with WMI data types only.

[1] https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/19_ASL_Reference/ACPI_Source_Language_Reference.html#unicode-string-to-unicode-conversion-macro
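
P.S. For anyone who wants to check the length-prefix arithmetic discussed
above, here is a minimal Python sketch. It is not taken from any driver or
from the WMI subsystem. It assumes little-endian UTF-16 and a 16-bit byte
count that includes the NUL terminator, as in Armin's explanation; the
function names encode_wmi_string() and decode_wmi_string() are made up for
illustration.

```python
import struct


def encode_wmi_string(text: str) -> bytes:
    """Return a u16 byte count followed by NUL-terminated UTF-16LE text.

    Assumption: the count covers the terminator, matching the 52-byte
    figure quoted in the thread.
    """
    payload = (text + "\0").encode("utf-16-le")
    return struct.pack("<H", len(payload)) + payload


def decode_wmi_string(buf: bytes) -> str:
    """Read the u16 byte count, then decode that many bytes as UTF-16LE."""
    (length,) = struct.unpack_from("<H", buf, 0)
    payload = buf[2:2 + length]
    return payload.decode("utf-16-le").rstrip("\0")


visible = "BIOS Configuration Change"  # 25 characters
encoded = encode_wmi_string(visible)

# What a plain UTF-16LE conversion of the string *with* the leading "4"
# looks like: "4" is U+0034, and 0x34 == 52, the payload size in bytes.
from_asl = ("4" + visible + "\0").encode("utf-16-le")

print(len(encoded) - 2)          # payload size in bytes
print(encoded == from_asl)       # the "4" doubles as the length prefix
print(decode_wmi_string(from_asl))
```

Under those assumptions the two byte sequences are identical, which is why
the "4" never shows up for WMI data consumers: the first two bytes are read
as the length rather than as a character.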