Hi, Bjorn, Rafael, Happy New Year :) I am wondering that do you have any comment to this implementation? Any feedback are welcomed. Thank you. Best regrads, Shuai 在 2021/12/24 PM4:55, Shuai Xue 写道: > Hi, Bjorn, > > Thank you for your comments. > > 在 2021/12/24 AM8:17, Bjorn Helgaas 写道: >> [+to Rafael, question about HEST/GHES/SDEI init] >> >> On Thu, Dec 23, 2021 at 04:11:11PM +0800, Shuai Xue wrote: >>> 在 2021/12/22 AM7:17, Bjorn Helgaas 写道: >>>> On Thu, Dec 16, 2021 at 09:34:56PM +0800, Shuai Xue wrote: >>>>> On an ACPI system, ACPI is initialised very early from a subsys_initcall(), >>>>> while SDEI is not ready until a subsys_initcall_sync(). >>>>> >>>>> The SDEI driver provides functions (e.g. apei_sdei_register_ghes, >>>>> apei_sdei_unregister_ghes) to register or unregister event callback for >>>>> dispatcher in firmware. When the GHES driver probing, it registers the >>>>> corresponding callback according to the notification type specified by >>>>> GHES. If the GHES notification type is SDEI, the GHES driver will call >>>>> apei_sdei_register_ghes to register event call. >>>>> >>>>> When the firmware emits an event, it migrates the handling of the event >>>>> into the kernel at the registered entry-point __sdei_asm_handler. And >>>>> finally, the kernel will call the registered event callback and return >>>>> status_code to indicate the status of event handling. SDEI_EV_FAILED >>>>> indicates that the kernel failed to handle the event. >>>>> >>>>> Consequently, when an error occurs during kernel booting, the kernel is >>>>> unable to handle and report errors until the GHES driver is initialized by >>>>> device_initcall(), in which the event callback is registered. All errors >>>>> that occurred before GHES initialization are missed and there is no chance >>>>> to report and find them again. >>>>> >>>>> From commit e147133a42cb ("ACPI / APEI: Make hest.c manage the estatus >>>>> memory pool") was merged, ghes_init() relies on acpi_hest_init() to manage >>>>> the estatus memory pool. On the other hand, ghes_init() relies on >>>>> sdei_init() to detect the SDEI version and the framework for registering >>>>> and unregistering events. >>>> >>>>> By the way, I don't figure out why acpi_hest_init is called in >>>>> acpi_pci_root_init, it don't rely on any other thing. May it could >>>>> be moved further, following acpi_iort_init in acpi_init. >> >>>>> sdei_init() relies on ACPI table which is initialized >>>>> subsys_initcall(): acpi_init(), acpi_bus_init(), acpi_load_tables(), >>>>> acpi_tb_laod_namespace(). May it should be also moved further, >>>>> after acpi_load_tables. >> >>>>> In this patch, move sdei_init and ghes_init as far ahead as >>>>> possible, right after acpi_hest_init(). >>>> >>>> I'm having a hard time figuring out the reason for this patch. >>>> >>>> Apparently the relevant parts are sdei_init() and ghes_init(). >>>> Today they are executed in that order: >>>> >>>> subsys_initcall_sync(sdei_init); >>>> device_initcall(ghes_init); >>>> >>>> After this patch, they would be executed in the same order, but called >>>> explicitly instead of as initcalls: >>>> >>>> acpi_pci_root_init() >>>> { >>>> acpi_hest_init(); >>>> sdei_init(); >>>> ghes_init(); >>>> ... >>>> >>>> Explicit calls are certainly better than initcalls, but that doesn't >>>> seem to be the reason for this patch. >>>> >>>> Does this patch fix a bug? If so, what is the bug? >>> >>> Yes. When the kernel booting, the console logs many times from firmware >>> before GHES drivers init: >>> >>> Trip in MM PCIe RAS handle(Intr:910) >>> Clean PE[1.1.1] ERR_STS:0x4000100 -> 0 INT_STS:F0000000 >>> Find RP(98:1.0) >>> --Walk dev(98:1.0) CE:0 UCE:4000 >>> ... >>> ERROR: sdei_dispatch_event(32a) ret:-1 >>> --handler(910) end >>> >>> This is because the callback function has not been registered yet. >>> Previously reported errors will be overwritten by new ones. Therefore, >>> all errors that occurred before GHES initialization are missed >>> and there is no chance to report and find them again. >>> >>>> You say that currently "errors that occur before GHES initialization >>>> are missed". Isn't that still true after this patch? Does this patch >>>> merely reduce the time before GHES initialization? If so, I'm >>>> dubious, because we have to tolerate an arbitrary amount of time >>>> there. >>> >>> After this patch, there are still errors missing. As you mentioned, >>> we have to tolerate it until the software reporting capability is built. >>> >>> Yes, this patch merely reduce the time before GHES initialization. >> >> It's not a bug that errors that happen before the callback are lost. >> At least, it's not a *Linux* bug. It might be a poor design of the >> firmware error reporting interface. >> >> If the only point of this patch is to reduce the time before GHES >> initialization, the commit log should clearly say that. > > Yep, it is a design defect of firmware. I will explicitly document > the purpose of this patch in next version. > >>> The boot dmesg before this patch: >>> >>> [ 3.688586] HEST: Table parsing has been initialized. >>> ... >>> [ 33.204340] calling sdei_init+0x0/0x120 @ 1 >>> [ 33.208645] sdei: SDEIv1.0 (0x0) detected in firmware. >>> ... >>> [ 36.005390] calling ghes_init+0x0/0x11c @ 1 >>> [ 36.190021] GHES: APEI firmware first mode is enabled by APEI bit and WHEA _OSC. >>> >>> >>> After this patch, the boot dmesg like bellow: >>> >>> [ 3.688664] HEST: Table parsing has been initialized. >>> [ 3.688691] sdei: SDEIv1.0 (0x0) detected in firmware. >>> [ 3.694557] GHES: APEI firmware first mode is enabled by APEI bit and WHEA _OSC. > > After this patch, we use explicit call to init GHES rather than initcall. > The GHES message here is just to prove that GHES initialization is > advanced to 3.694557 seconds. > > >> [Tangent: I think this GHES message is confusing. What "APEI bit" >> does this refer to? The only bits I remember are the Flags bits in >> HEST Error Source Descriptor Entries, e.g., ACPI v6.3, sec 18.3.2. > > From the log code behavior, the APEI bit refers to OSC_SB_APEI_SUPPORT. > (BIT 4, ACPI v6.4, sec 6.2.11.2, table Table 6.13 Platform-Wide _OSC > Capabilities DWORD 2. [1]) > > /* code in drivers/acpi/apei/ghes.c */ > if (rc == 0 && osc_sb_apei_support_acked) > pr_info(GHES_PFX "APEI firmware first mode is enabled by APEI bit and WHEA _OSC.\n"); > > /* code in drivers/acpi/bus.c */ > osc_sb_apei_support_acked = > capbuf_ret[OSC_SUPPORT_DWORD] & OSC_SB_APEI_SUPPORT; > > >> "WHEA _OSC" means nothing to me, and I didn't find anything useful >> with grep, other than that "WHEA" might be an obsolete name for what >> we now call "APEI". > > Yep, WHEA is an obsolete name. I think apei_osc_setup() is added for compatibility > in commit eccddd32ced0 ("ACPI, APEI, Add APEI bit support in generic _OSC call")' > > The ACPI spec Platform-Wide OSPM Capabilities defines > UUID "0811B06E-4A27-44F9-8D60-3CBBC22E7B48" is used in > acpi_bus_osc_support(). The Microsoft WHEA defines > UUID "ed855e0c-6c90-47bf-a62a-26de0fc5ad5c" that is used > in apei_osc_setup(). [2][3] > > From the discussion[4], the GHES message is to distinguish between > the two specs. > > To avoid confusion, can we change the message as follow. > > - generic _OSC succeeded, APEI _OSC failed: APEI firmware first mode is > enabled by APEI bit. > - generic _OSC failed, APEI _OSC succeeded: APEI firmware first mode is > enabled by WHEA _OSC. > - both succeeded: APEI firmware first mode is enabled by APEI bit and > WHEA _OSC. > - both failed: Failed to enable APEI firmware first mode! > > >> I don't think there's anything in _OSC that mentions "firmware first." >> >> I don't remember anything in the spec about a way to *enable* Firmware >> First Error Handling needing (I'm looking at ACPI v6.3, sec 18.4). >> >> I think the "firmware first" information is useless to the OS -- as >> far as I can tell, the spec says nothing about anything the OS should >> do based on the FIRMWARE_FIRST bits.] > > GHES works in "Firmware First" mode to report platform hardware error > and it depends on APEI interface. (drivers/acpi/apei/ghes.c) > > * Generic Hardware Error Source provides a way to report platform > * hardware errors (such as that from chipset). It works in so called > * "Firmware First" mode, that is, hardware errors are reported to > * firmware firstly, then reported to Linux by firmware. > > Based on above, I think the GHES message just means that this booting > kernel supports APEI or WHEA, and GHES report Firmware first errors on > this machine. Is it fine to keep the message? > > >>> As we can see, the initialization of GHES is advanced by 33 seconds. >>> So, in my opinion, this patch is necessary, right? >>> (It should be noted that the effect of optimization varies with the platform.) >> >>>>> -device_initcall(ghes_init); >>>> >>>>> void __init acpi_pci_root_init(void) >>>>> { >>>>> acpi_hest_init(); >>>>> + sdei_init(); >>>>> + ghes_init(); >>>> >>>> What's the connection between PCI, SDEI, and GHES? As far as I can >>>> tell, SDEI and GHES are not PCI-specific, so it doesn't seem like they >>>> should be initialized here in acpi_pci_root_init(). >>> >>> The only reason is that acpi_hest_init() is initialized here. >>> >>> From commit e147133a42cb ("ACPI / APEI: Make hest.c manage the estatus >>> memory pool") was merged, ghes_init() relies on acpi_hest_init() to manage >>> the estatus memory pool. On the other hand, ghes_init() relies on >>> sdei_init() to detect the SDEI version and the framework for registering >>> and unregistering events. The dependencies are as follows >>> >>> ghes_init() => acpi_hest_init() >>> ghes_init() => sdei_init() >>> >>> I don't figure out why acpi_hest_init() is called in >>> acpi_pci_root_init(), it don't rely on any other thing. >>> I am wondering that should we moved all of them further? e.g. >>> following acpi_iort_init() in acpi_init(). >> >> I don't know why acpi_hest_init() is called from acpi_pci_root_init(). >> It looks like HEST can support error sources other than PCI (IA-32 >> Machine Checks, NMIs, GHES, etc.) It was added by 415e12b23792 >> ("PCI/ACPI: Request _OSC control once for each root bridge (v3)"); >> maybe Rafael remembers why. >> >> Seem like acpi_hest_init(), sdei_init(), and ghes_init() should all go >> somewhere else, but I don't know where. Maybe somewhere in >> acpi_init()? > > I tried to move them into acpi_init(), and it works well. > > @@ -1251,6 +1251,9 @@ static int __init acpi_init(void) > > pci_mmcfg_late_init(); > acpi_iort_init(); > + acpi_hest_init(); > + sdei_init(); > + ghes_init(); > acpi_scan_init(); > acpi_ec_init(); > > What's your opinion? > > Best Regards, > > Shuai > > > [1] https://uefi.org/specs/ACPI/6.4/06_Device_Configuration/Device_Configuration.html#platform-wide-osc-capabilities-dword-2 > [2] https://www.mail-archive.com/ubuntu-bugs@xxxxxxxxxxxxxxxx/msg4718503.html > [3] http://www.guyreams.com/PDF/whea.pdf > [4] https://patchwork.kernel.org/project/linux-acpi/patch/1308640587-24502-5-git-send-email-ying.huang@xxxxxxxxx/#1978922