Re: [RFC PATCH v2 3/3] ACPI: extlog: Make print_extlog_rcd() log unconditionally

Dan Williams <dan.j.williams@xxxxxxxxx> · Thu, 16 May 2024 11:56:15 -0700

[ add Steven for a "deprecate emitting hardware errors with explicit printk" discussion ]

Borislav Petkov wrote:
> On Sun, May 12, 2024 at 04:45:08PM -0700, Dan Williams wrote:
> > Yes, my point though was that if it got deleted I doubt anyone would
> > notice. rasdaemon explicitly does not check the return from
> > open("daemon_active").
> 
> The intent was for userspace to open it and thus it'll increment
> trace_count which then ras_userspace_consumers() reads...
> 
> > I am also curious about the history here. This "daemon_active" scheme is
> > an awkward way to detect that something is consuming the tracepoint. It
> > was added on v4.0, but Steven had added "tracepoint_enabled()" back in
> > v3.17:
> > 
> > 7c65bbc7dcfa tracing: Add trace_<tracepoint>_enabled() function
> 
> Ha, I usually talk to Rostedt for all things tracepoint when wondering
> how we could use them for RAS purposes but I haven't this time, it
> seems.
> 
> > So even if non-rasdaemon userspace was watching the extlog tracepoints
> > they would not fire because ras_userspace_consumers() prevents it.
> >
> > I am finding it difficult to see why ras_userspace_consumers() needs to
> > continue to be maintained.
> 
> Well, you still need some functionality which tests whether a userspace
> daemon consumes RAS events. Whether it is something cludgy like now or
> something which checks whether all RAS tracepoints have been enabled,
> something's gotta be there.

Honest question, why? Lets deprecate that path. As an example, this
extlog path has bit-rotted and not kept pace with the same level of
error reporting that is available via ACPI GHES or OS native
tracepoints.

Given that is has not kept pace the next question is whether the kernel
should bother to maintain the contract => "if nothing is watching
tracepoints some subset (all?) hardware error messages will be reflected
to the kernel log".

I would point to tp_printk as a way to get tracepoints into the kernel
log. If that is too coarse-grained a replacement for print_extlog_rcd()
I would advocate spending time making tp_printk control more
fine-grained rather than perpetuate this duplicated emission path for
error info.

Something like a new annotation for tracepoints marking them as "emit to
kernel log if no-one is watching this high-priority trace event"?

> > That would be odd since there is no ras_userspace_consumers() in the
> > ACPI GHES path,
> 
> Probably because no one's using RAS daemon with GHES. I at least haven't
> heard of anyone complaining about this yet...

Well no, there is little to complain about in that path because that
path does not play "is anybody watching" games. It always emits to the
kernel log (subject to rate limiting) and then follows up with always
emitting a tracepoint (subject to standard trace filtering).

For CXL I asked that its events deprecate the kernel log path with the
hope of not growing new userspace dependencies on kernel log parsing for
newfangled CXL errors.

[..]
> > Would be great to hear from folks that have a reasons for kernel log
> > message error reporting to continue.
> 
> Right, from my experience so far, you never hear anything. :-\
> 
> So if we do anything, it should be something simple and which works for
> almost everyone.
> 
> With RAS, everyone does their own thing. And then there's the firmware
> which claims that it can do better RAS but then f*cks up on basic things
> like *actually* shipping a working EINJ or whatever implementation.

*sad nod*

> So in the end of the day it is, oh, we need our drivers in the OS
> because we can't fix firmware. It is harder to fix it than *hardware*
> :-P

At least when the firmware gets out of the way Linux has a chance to
solve user issues.

> > Uniformity of error response to "fatal" events, but that is mainly a
> > PCIe error handling concern not  CPU errors.
> 
> Sure, just make sure to keep it simple and generic.

Yes, tracepoints feel simple and generic to me while kernel log messages
with rate-limiting is already a lossy proposition.