Re: [PATCH v1] dynamic_debug: add support for logs destination

Pekka Paalanen <ppaalanen@xxxxxxxxx> · Wed, 11 Oct 2023 11:48:16 +0300

On Tue, 10 Oct 2023 10:06:02 -0600
jim.cromie@xxxxxxxxx wrote:

> since I name-dropped you all,

Hi everyone,

I'm really happy to see this topic being developed! I've practically
forgotten about it myself, but the need for it has not diminished at all.

I didn't understand much of the conversation, so I'll just reiterate
what I would use it for, as a Wayland compositor developer.

I added a few more cc's to get better coverage of DRM and Wayland
compositor developers.

> On Tue, Oct 10, 2023 at 10:01 AM <jim.cromie@xxxxxxxxx> wrote:
> >
> > On Mon, Oct 9, 2023 at 4:47 PM Łukasz Bartosik <lb@xxxxxxxxxxxx> wrote:  

...

> > > I don't have a real life use case to configure different trace
> > > instance for each callsite.
> > > I just tried to be as much flexible as possible.
> > >  
> >
> > Ive come around to agree - I looked back at some old threads
> > (that I was a part of, and barely remembered :-}
> >
> > At least Sean Paul, Lyude, Simon Ser, Pekka Paalanen
> > have expressed a desire for a "flight-recorder"
> > it'd be hard to say now that 2-3 such buffers would always be enough,
> > esp as theres a performance reason for having your own.

A Wayland compositor has roughly three important areas where kernel
debug messages might come in handy:
- input
- DRM KMS
- DRM GPU rendering

DRM KMS is the one I've been thinking of in the flight recorder context
the most, because KMS hardware varies a lot, and there is plenty of
room for both KMS drivers and KMS userspace to go wrong. The usual
result is your display doesn't work, so the system is practically
unusable to the end user. In the wild, the simplest or maybe the only
way out of that may be a reboot, maybe an automated one (e.g. digital
signage). In order to debug such problems, we would need both
compositor logs and the relevant kernel debug messages.

For example, Weston already has a flight recorder framework of its own,
so we have the compositor debug logs. It would be useful to get the
selected kernel debug logs in the same place. It could be used for
automated or semi-manual bug reporting, for example, making it much
easier for the administrator or end user to report issues.

Since this is usually a production environment, and the Wayland
compositor runs without root privileges, we need something that works
with that. We would likely want the kernel debug messages in the
compositor to combine and order them properly with the compositor debug
messages.
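
A minimal sketch of that combine-and-order step, in Python for brevity.
It assumes both sources stamp their messages from the same clock and
that each stream is already sorted; the Record type and its field names
are hypothetical, not an existing kernel or Weston interface.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Record(NamedTuple):
    timestamp_ns: int  # assumption: both sources use the same clock
    source: str        # "kernel" or "compositor"
    text: str          # free-form text, meant for human readers only


def merge_streams(compositor: Iterable[Record],
                  kernel: Iterable[Record]) -> Iterator[Record]:
    """Merge two streams that are each already sorted by timestamp."""
    return heapq.merge(compositor, kernel, key=lambda r: r.timestamp_ns)


# Placeholder messages, not real log output.
compositor_log = [
    Record(100, "compositor", "example compositor message A"),
    Record(300, "compositor", "example compositor message B"),
]
kernel_log = [
    Record(200, "kernel", "example kernel debug message A"),
    Record(250, "kernel", "example kernel debug message B"),
]

merged = list(merge_streams(compositor_log, kernel_log))
for record in merged:
    print(record.timestamp_ns, record.source, record.text)
```

The merge only looks at the timestamp field and never at the message
text, so it does not depend on the wording of any debug string.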

It's quite likely that developers would like to pick and choose which
kernel debug messages might be interesting enough to record, to avoid
excessive log flooding. The flight recorder in Weston is fixed size to
avoid running out of memory or disk space. I can also see that Weston
could have debugging options that affect which kernel debug messages it
subscribes to. We can have a reasonable default setup that allows us to
pinpoint the problem area and figure out most problems, and if needed,
we could ask the administrator to pass another debug option to Weston. It
helps if there is just one place to configure everything about the
compositor.
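
To illustrate the fixed-size property, here is a rough sketch of a
bounded recorder that drops the oldest entries once it is full. This is
not Weston's actual implementation (which is in C); the FlightRecorder
class and its capacity counted in messages are assumptions made for the
example.

```python
from collections import deque
from typing import List


class FlightRecorder:
    """Keeps only the most recent `capacity` messages (hypothetical API)."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        # deque with maxlen silently discards the oldest entry when full.
        self._ring = deque(maxlen=capacity)

    def record(self, message: str) -> None:
        self._ring.append(message)

    def dump(self) -> List[str]:
        """Return the retained messages, oldest first."""
        return list(self._ring)


recorder = FlightRecorder(capacity=3)
for i in range(5):
    recorder.record(f"message {i}")

dumped = recorder.dump()
print(dumped)
```

Memory use stays bounded no matter how many messages arrive, which is
why selecting messages at the source matters: anything uninteresting
that gets recorded pushes out something that might have been useful.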

This implies that it would be really nice to have userspace subscriber
specific debug message streams from the kernel, or a good way to filter
the messages we want. A Wayland compositor would not be interested in
file system or wireless debugs for example, but another system
component might be. There is also a security aspect of which component is
allowed to see which messages in case they could contain anything
sensitive (input debug could contain typed passwords).

Configuring the kernel debug message selection for our debug message
stream can and probably should require elevated privileges, and we can
likely solve that in userspace with a daemon or such, to allow the
Wayland compositor to run as a regular user.

Thinking of desktop systems, and especially physically multi-seat systems:
- there can be multiple different Wayland compositors running simultaneously
- each of them may want debug messages only from a specific DRM KMS
  device instance, and not from others
- each of them may have a different idea of which debug messages are important
- because DRM KMS leasing is a thing, different compositor instances
  could be using the same DRM KMS device instance simultaneously; since
  this is specific to DRM KMS, and it should be harmless to get a
  little too much DRM KMS debug (that is, from the whole device instead
  of just the leased parts), it may not be worth splitting debug
  message streams this far.

If userspace is offered some standardised fields in kernel debug
structures, then userspace could do some filtering on its own too, but I
guess it would be better to filter at the source and not need that.

There is also an anti-goal. The kernel debug message contents are
specifically not machine-parsable. I very much do not want to impose
debug strings as ABI; they are for human (and AI?) readers only.

As a summary, here are the most important requirements first:
- usable in production, as something normally left enabled by default
- final delivery to unprivileged userspace process
- per debug-print selection of messages (finer or coarser, categories
  within a kernel sub-system could be enough)
- per originating device (driver instance) selection of messages
- all selections tailored separately for each userspace subscriber
(- per open device file description selection of messages)
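
The selection requirements above could be sketched roughly like this,
assuming each message carries a device identifier and a category as
structured fields. All names here (DebugMessage, Subscription, route,
the "card0"/"card1" devices and the category labels) are hypothetical
and only illustrate the idea; none of this is an existing interface.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, NamedTuple


class DebugMessage(NamedTuple):
    device: str    # originating device (driver instance), hypothetical
    category: str  # debug category within a sub-system, hypothetical
    text: str      # human-readable only, never parsed


@dataclass(frozen=True)
class Subscription:
    name: str                   # the userspace subscriber
    devices: FrozenSet[str]     # device instances it cares about
    categories: FrozenSet[str]  # message categories it cares about

    def wants(self, msg: DebugMessage) -> bool:
        # Selection uses structured fields only, not the message text.
        return msg.device in self.devices and msg.category in self.categories


def route(msg: DebugMessage, subscriptions: Iterable[Subscription]) -> List[str]:
    """Return the names of the subscribers that should receive msg."""
    return [s.name for s in subscriptions if s.wants(msg)]


subscriptions = [
    Subscription("compositor-seat0", frozenset({"card0"}), frozenset({"kms", "atomic"})),
    Subscription("compositor-seat1", frozenset({"card1"}), frozenset({"kms"})),
]

print(route(DebugMessage("card0", "atomic", "example text"), subscriptions))
print(route(DebugMessage("card1", "kms", "example text"), subscriptions))
print(route(DebugMessage("card0", "input", "example text"), subscriptions))
```

Each subscriber has its own selection, so two compositors on different
seats never see each other's device messages, and a message that nobody
selected (the third one here) is delivered to no one.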

That's my idea of it. It is interesting to see how far the requirements
can be reasonably realised.

Thanks,
pq