On Mon, Aug 21, 2023 at 7:35 PM Limonciello, Mario
<mario.limonciello@xxxxxxx> wrote:
>
> On 8/21/2023 12:29 PM, Rafael J. Wysocki wrote:
> > On Mon, Aug 21, 2023 at 7:17 PM Limonciello, Mario
> > <mario.limonciello@xxxxxxx> wrote:
> >>
> >> On 8/21/2023 12:12 PM, Rafael J. Wysocki wrote:
> >> <snip>
> >>>> I was just talking to some colleagues about PHAT recently as well.
> >>>>
> >>>> The use case that jumps out is "system randomly rebooted while I was
> >>>> doing XYZ". You don't know what happened, but you keep using your
> >>>> system. Then it happens again.
> >>>>
> >>>> If the reason for the random reboot is captured to dmesg you can cross
> >>>> reference your journal from the next boot after any random reboot and
> >>>> get the reason for it. If a user reports this to a Gitlab issue tracker
> >>>> or Bugzilla it can be helpful in establishing a pattern.
> >>>>
> >>>>>> The below location may be appropriate in that case:
> >>>>>> /sys/firmware/acpi/
> >>>>>
> >>>>> Yes, it may.
> >>>>>
> >>>>>> We already have FPDT and BGRT being exported from there.
> >>>>>
> >>>>> In fact, all of the ACPI tables can be retrieved verbatim from
> >>>>> /sys/firmware/acpi/tables/ already, so why exactly do you want the
> >>>>> kernel to parse PHAT in particular?
> >>>>>
> >>>>
> >>>> It's not to say that /sys/firmware/acpi/PHAT isn't useful, but having
> >>>> something internal to the kernel "automatically" parsing it and saving
> >>>> information to a place like the kernel log that is already captured by
> >>>> existing userspace tools I think is "more" useful.
> >>>
> >>> What existing user space tools do you mean? Is there anything already
> >>> making use of the kernel's PHAT output?
> >>>
> >>
> >> I was meaning things like systemd already capture the kernel log
> >> ringbuffer. If you save stuff like this into the kernel log, it's going
> >> to be indexed and easier to grep for boots that had it.
> >>
> >>> And why can't user space simply parse PHAT by itself?
> >>>
> >>> There are multiple ACPI tables that could be dumped into the kernel
> >>> log, but they aren't. Guess why.
> >>
> >> Right; there's no reason it can't be done by userspace directly.
> >>
> >> Another way to approach this problem could be to modify tools that
> >> excavate records from a reboot to also get PHAT. For example
> >> systemd-pstore will get any kernel panics from the previous boot from
> >> the EFI pstore and put them into /var/lib/systemd/pstore.
> >>
> >> No reason that couldn't be done automatically for PHAT too.
> >
> > I'm not sure about the connection between the PHAT dump in the kernel
> > log and pstore.
> >
> > The PHAT dump would be from the time before the failure, so it is
> > unclear to me how useful it can be for diagnosing it. However, after
> > a reboot one should be able to retrieve PHAT data from the table
> > directly and that may include some information regarding the failure.
>
> Right so the thought is that at bootup you get the last entry from PHAT
> and save that into the log.
>
> Let's say you have 3 boots:
> X - Triggered a random reboot
> Y - Cleanly shut down
> Z - Boot after a clean shut down
>
> So on boot Y you would have in your logs the reason that boot X rebooted.

Yes, and the same can be retrieved from the PHAT directly from user
space at that time, can't it?

> On boot Z you would see something about boot Y's reason.
>
> >
> > With pstore, the assumption is that there will be some information
> > relevant for diagnosing the failure in the kernel buffer, but I'm not
> > sure how the PHAT dump from before the failure can help here?
>
> Alone it's not useful.
> I had figured if you can put it together with other data it's useful.
> For example if you had some thermal data in the logs showing which
> component overheated or if you looked at pstore and found a NULL pointer
> dereference.

IIUC, the current PHAT content can be useful.
The PHAT content from boot X (before the failure), which is what will be
there in pstore after the random reboot, is of limited value AFAICS.
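To illustrate the point about user space parsing the table by itself,
here is a minimal sketch, not a tested tool. It assumes the table is
exposed as /sys/firmware/acpi/tables/PHAT (reading it normally needs
root), that it starts with the standard 36-byte ACPI table header, and
that each record begins with a 2-byte type, a 2-byte length and a 1-byte
revision, which is my reading of the ACPI spec. The names parse_phat()
and read_phat() are made up for this example.

```python
import struct

# Standard ACPI table header (36 bytes): signature, length, revision,
# checksum, OEM ID, OEM table ID, OEM revision, creator ID, creator revision.
ACPI_HEADER = struct.Struct("<4sIBB6s8sI4sI")

# Assumed PHAT record header: record type, record length, revision.
RECORD_HEADER = struct.Struct("<HHB")


def parse_phat(table: bytes):
    """Return a list of (record_type, revision, payload) tuples."""
    if len(table) < ACPI_HEADER.size:
        raise ValueError("table is shorter than an ACPI header")
    signature, length = ACPI_HEADER.unpack_from(table)[:2]
    if signature != b"PHAT":
        raise ValueError(f"unexpected signature {signature!r}")

    # Never read past either the declared length or the actual data.
    end = min(length, len(table))
    offset = ACPI_HEADER.size
    records = []
    while offset + RECORD_HEADER.size <= end:
        rtype, rlen, rev = RECORD_HEADER.unpack_from(table, offset)
        # Stop on a malformed record instead of looping forever.
        if rlen < RECORD_HEADER.size or offset + rlen > end:
            break
        payload = table[offset + RECORD_HEADER.size:offset + rlen]
        records.append((rtype, rev, payload))
        offset += rlen
    return records


def read_phat(path="/sys/firmware/acpi/tables/PHAT"):
    """Hypothetical helper: read the raw table from sysfs and parse it."""
    with open(path, "rb") as f:
        return parse_phat(f.read())
```

A boot-time service could call read_phat() once and write whatever it
finds relevant to the journal, which would make the data searchable per
boot without the kernel having to decode the table. Decoding the record
payloads themselves is left out here on purpose.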