On Thu, Sep 28, 2023 at 10:42:39AM +0800, Yuanhe Shu wrote:
> In public cloud scenario, if kdump service works abnormally,
> users cannot get vmcore. Without vmcore, user has no idea why the
> kernel crashed. Meanwhile, there is no additional information
> to find the reason why the kdump service is abnormal.
>
> One way is to obtain console messages through VNC. The drawback
> is that VNC is real-time, if user missed the timing to get the VNC
> output, the crash needs to be retriggered.
>
> Another way is to enable the console frontend of pstore and record the
> console messages to the pstore backend. On the one hand, the console
> logs only contain kernel printk logs and does not cover
> user-mode print logs. Although we can redirect user-mode logs to the
> pmsg frontend provided by pstore, user-mode information related to
> booting and kdump service vary from systemd, kdump.sh, and so on which
> makes redirection troublesome. So we added a tty frontend and save all
> logs of tty driver to the pstore backend.

This is a clever solution!

> Another problem is that currently pstore only supports a single backend.
> For debugging kdump problems, we hope to save the console logs and tty
> logs to the ramoops backend of pstore, as it will not be lost after
> rebooting. If the user has enabled another backend, the ramoops backend
> will not be registered. To this end, we add the multi-backend function
> to support simultaneous registration of multiple backends.

Ah very cool; I really like this idea. I'd wanted to do it for a while
just to make testing easier, but I hadn't had time to attempt it.

> Based on the above changes, we can enable pstore in the crashdump kernel
> and save the console logs and tty logs to the ramoops backend of pstore.
> After rebooting, we can view the relevant logs by mounting the pstore
> file system.

So, before I do a line-at-a-time review of this code, I'd like to address
some design issues first.

I really don't want to make behavioral differences when we don't have to:

- The multi-backend will enable _all possible_ backends, and that's a big
  change that will do weird things for some pstore users. I would prefer
  a pstore option to opt-in to enabling all backends. Perhaps have
  "pstore.backend=" be parsed with commas, so a list of backends can be
  provided, or "all" for the "all backends" behavior.

- Moving the pstorefs files into a subdirectory will break userspace
  immediately (e.g. systemd-pstore expects very specifically named
  files). Using subdirectories seems like a good idea, but perhaps we
  need hardlinks into the root pstorefs for the "first" backend, or some
  other creative solution here.

Then some technical thoughts about the TTY frontend's behavior:

- That 2 pstore records are created for every line of TTY output feels
  kind of inefficient, though I don't have a better idea. This is really
  only doable as you have it because the ramoops and zone backends treat
  the single prz as a circular buffer. I wonder about supporting this on
  other backends like EFI, but perhaps it's just not going to happen.

- I'd like to check with the TTY folks to see if this is the "right"
  place to hook to get a copy of what's being written.

Thanks and let me know what you think!

-Kees

-- 
Kees Cook
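
For reference, the comma-separated "pstore.backend=" idea above could be
sketched roughly as follows. This is an untested userspace illustration,
not code from the patch series or from pstore itself: the struct, the
function names (parse_backend_list, backend_enabled), and the size limits
are all hypothetical. It only shows the parsing shape: a list of names,
with "all" as the opt-in for every backend.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical limits, chosen only for this sketch. */
#define MAX_BACKENDS 8
#define MAX_NAME_LEN 32

/* Hypothetical result of parsing a "backend=" style argument. */
struct backend_filter {
	bool all;		/* the literal "all" was given */
	size_t count;		/* number of explicitly named backends */
	char names[MAX_BACKENDS][MAX_NAME_LEN];
};

/*
 * Parse a list such as "ramoops,efi" or "all".
 * Returns 0 on success and -1 on malformed input (empty element,
 * trailing comma, over-long name, or too many names).
 */
static int parse_backend_list(const char *arg, struct backend_filter *f)
{
	memset(f, 0, sizeof(*f));
	if (!arg || !*arg)
		return 0;	/* nothing requested; filter stays empty */

	while (*arg) {
		size_t len = strcspn(arg, ",");

		if (len == 0 || len >= MAX_NAME_LEN)
			return -1;

		if (len == 3 && strncmp(arg, "all", 3) == 0) {
			f->all = true;
		} else {
			if (f->count == MAX_BACKENDS)
				return -1;
			memcpy(f->names[f->count], arg, len);
			f->names[f->count][len] = '\0';
			f->count++;
		}

		arg += len;
		if (*arg == ',') {
			arg++;
			if (!*arg)
				return -1;	/* trailing comma */
		}
	}
	return 0;
}

/* Would a backend with this name be allowed to register? */
static bool backend_enabled(const struct backend_filter *f, const char *name)
{
	if (f->all)
		return true;
	for (size_t i = 0; i < f->count; i++)
		if (strcmp(f->names[i], name) == 0)
			return true;
	return false;
}
```

With this shape, "ramoops,efi" would admit only those two names, while
"all" would admit anything. An empty argument leaves the filter empty;
how that should map onto today's single-backend default is deliberately
left out of the sketch.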