Fabio M. De Francesco wrote: > On Tuesday, August 6, 2024 9:56:24 PM GMT+2 Dan Williams wrote: > > Fabio M. De Francesco wrote: > > > Currently, extlog_print() (ELOG) only reports CPER PCIe section (UEFI > > > v2.10, Appendix N.2.7) to the kernel log via print_extlog_rcd(). > > > > I think the critical detail is is that print_extlog_rcd() is only > > triggered when ras_userspace_consumers() returns true. The observation > > is that ras_userspace_consumers() hides information from the trace path > > when the intended purpose of it was to hide duplicate emissions to the > > kernel log when userspace is watching the tracepoints. > > > > Setting aside whether ras_userspace_consumers() is still a good idea or > > not, it is obvious that this patch as is may surprise environments that > > start seeing kernel error logs where the kernel was silent before. > > > > I think the path of least surprise would be to make sure that > > pci_print_aer() optionally skips emitting to the kernel log when not > > needed wanted. > > Sorry for replying so late... > > I'm not entirely sure that users would not prefer to be surprised by > _finally_ seeing kernel error logs for failing PCIe components. I suspect > that users might have been confused by not seeing any output. 2 notes: * New KERN_ERR prints are often found to be unwelcome. When the kernel starts printing new error messages it causes sysadmins to scramble. * The future of RAS is trace-events. Any new RAS messages to the kernel log need to ask the question, "is userspace better served by registering for a RAS trace event, rather than parsing kernel log messsages". [..] > I need to be sure that I understood... > > void pci_print_aer(char *level, struct pci_dev *dev, int aer_severity, > struct aer_capability_regs *aer) > { > [...] > > if (printk_get_level(level) <= console_loglevel) { > pci_err(dev, "aer_status: 0x%08x, aer_mask: 0x%08x\n", > status, mask); No, the code would be: pci_printk(level, dev, "aer_status: 0x%08x, aer_mask: 0x%08x\n", status, mask); ...i.e. just pass @level rather than open code "if (printk_get_level(level) <= console_loglevel)".