On Mon, Nov 22, 2021 at 2:02 AM Pekka Paalanen <ppaalanen@xxxxxxxxx> wrote: > > On Fri, 19 Nov 2021 11:21:36 -0500 > Jason Baron <jbaron@xxxxxxxxxx> wrote: > > > On 11/18/21 10:24 AM, Pekka Paalanen wrote: > > > On Thu, 18 Nov 2021 09:29:27 -0500 > > > Jason Baron <jbaron@xxxxxxxxxx> wrote: > > > > > >> On 11/16/21 3:46 AM, Pekka Paalanen wrote: > > >>> On Fri, 12 Nov 2021 10:08:41 -0500 > > >>> Jason Baron <jbaron@xxxxxxxxxx> wrote: > > >>> > > >>>> On 11/12/21 6:49 AM, Vincent Whitchurch wrote: > > >>>>> On Thu, Nov 11, 2021 at 03:02:04PM -0700, Jim Cromie wrote: > > >>>>>> Sean Paul proposed, in: > > >>>>>> https://urldefense.com/v3/__https://patchwork.freedesktop.org/series/78133/__;!!GjvTz_vk!HcKnMRByYkIdyF1apqQjlN5aBIomzJR1an3YWXM6KXs0EftVMQdrewRA8Dki4A$ > > >>>>>> drm/trace: Mirror DRM debug logs to tracefs > > >>>>>> > > >>>>>> His patchset's objective is to be able to independently steer some of > > >>>>>> the drm.debug stream to an alternate tracing destination, by splitting > > >>>>>> drm_debug_enabled() into syslog & trace flavors, and enabling them > > >>>>>> separately. 2 advantages were identified: > > >>>>>> > > >>>>>> 1- syslog is heavyweight, tracefs is much lighter > > >>>>>> 2- separate selection of enabled categories means less traffic > > >>>>>> > > >>>>>> Dynamic-Debug can do 2nd exceedingly well: > > >>>>>> > > >>>>>> A- all work is behind jump-label's NOOP, zero off cost. > > >>>>>> B- exact site selectivity, precisely the useful traffic. > > >>>>>> can tailor enabled set interactively, at shell. > > >>>>>> > > >>>>>> Since the tracefs interface is effective for drm (the threads suggest > > >>>>>> so), adding that interface to dynamic-debug has real potential for > > >>>>>> everyone including drm. > > >>>>>> > > >>>>>> if CONFIG_TRACING: > > >>>>>> > > >>>>>> Grab Sean's trace_init/cleanup code, use it to provide tracefs > > >>>>>> available by default to all pr_debugs. This will likely need some > > >>>>>> further per-module treatment; perhaps something reflecting hierarchy > > >>>>>> of module,file,function,line, maybe with a tuned flattening. > > >>>>>> > > >>>>>> endif CONFIG_TRACING > > >>>>>> > > >>>>>> Add a new +T flag to enable tracing, independent of +p, and add and > > >>>>>> use 3 macros: dyndbg_site_is_enabled/logging/tracing(), to encapsulate > > >>>>>> the flag checks. Existing code treats T like other flags. > > >>>>> > > >>>>> I posted a patchset a while ago to do something very similar, but that > > >>>>> got stalled for some reason and I unfortunately didn't follow it up: > > >>>>> > > >>>>> https://urldefense.com/v3/__https://lore.kernel.org/lkml/20200825153338.17061-1-vincent.whitchurch@xxxxxxxx/__;!!GjvTz_vk!HcKnMRByYkIdyF1apqQjlN5aBIomzJR1an3YWXM6KXs0EftVMQdrewRGytKHPg$ > > >>>>> > > >>>>> A key difference between that patchset and this patch (besides that > > >>>>> small fact that I used +x instead of +T) was that my patchset allowed > > >>>>> the dyndbg trace to be emitted to the main buffer and did not force them > > >>>>> to be in an instance-specific buffer. > > >>>> > > >>>> Yes, I agree I'd prefer that we print here to the 'main' buffer - it > > >>>> seems to keep things simpler and easier to combine the output from > > >>>> different sources as you mentioned. > > >>> > > >>> Hi, > > >>> > > >>> I'm not quite sure I understand this discussion, but I would like to > > >>> remind you all of what Sean's original work is about: > > >>> > > >>> Userspace configures DRM tracing into a flight recorder buffer (I guess > > >>> this is what you refer to "instance-specific buffer"). > > >>> > > >>> Userspace runs happily for months, and then hits a problem: a failure > > >>> in the DRM sub-system most likely, e.g. an ioctl that should never > > >>> fail, failed. Userspace handles that failure by dumping the flight > > >>> recorder buffer into a file and saving or sending a bug report. The > > >>> flight recorder contents give a log of all relevant DRM in-kernel > > >>> actions leading to the unexpected failure to help developers debug it. > > >>> > > >>> I don't mind if one can additionally send the flight recorder stream to > > >>> the main buffer, but I do want the separate flight recorder buffer to > > >>> be an option so that a) unrelated things cannot flood the interesting > > >>> bits out of it, and b) the scope of collected information is relevant. > > >>> > > >>> The very reason for this work is problems that are very difficult to > > >>> reproduce in practice, either because the problem itself is triggered > > >>> very rarely and randomly, or because the end users of the system have > > >>> either no knowledge or no access to reconfigure debug logging and then > > >>> reproduce the problem with good debug logs. > > >>> > > >>> Thank you very much for pushing this work forward! > > >>> > > >>> > > >> > > >> So I think Vincent (earlier in the thread) was saying that he finds it > > >> very helpful have dynamic debug output go to the 'main' trace buffer, > > >> while you seem to be saying you'd prefer it just go to dynamic debug > > >> specific trace buffer. > > > > > > Seems like we have different use cases: traditional debugging, and > > > in-production flight recorder for problem reporting. I'm not surprised > > > if they need different treatment. > > > > > >> So we certainly can have dynamic output potentially go to both places - > > >> although I think this would mean two tracepoints? But I really wonder > > >> if we really need a separate tracing buffer for dynamic debug when > > >> what goes to the 'main' buffer can be controlled and filtered to avoid > > >> your concern around a 'flood'? > > > > > > If the DRM tracing goes into the main buffer, then systems in > > > production cannot have any other sub-system traced in a similar > > > fashion. To me it would feel very arrogant to say that to make use of > > > DRM flight recording, you cannot trace much or anything else. > > > > > > The very purpose of the flight recorder is run in production all the > > > time, not in a special debugging session. > > > > > > There is also the question of access and contents of the trace buffer. > > > Ultimately, if automatic bug reports are enabled in a system, the > > > contents of the trace buffer would be sent as-is to some bug tracking > > > system. If there is a chance to put non-DRM stuff in the trace buffer, > > > that could be a security problem. > > > > > > My use case is Weston. When Weston encounters an unexpected problem in > > > production, something should automatically capture the DRM flight > > > recorder contents and save it alongside the Weston log. Would be really > > > nice if Weston itself could do that, but I suspect it is going to need > > > root privileges so it needs some helper daemon. > > > > > > Maybe Sean can reiterate their use case more? > > > > > > > > > Thanks, > > > pq > > > > > > > Ok, so in this current thread the proposal was to create a "dyndbg-tracefs" > > buffer to put the dynamic debug output (including drm output from dynamic > > debug) into. And I was saying let's just put in the 'main' trace buffer > > (predicated on a dynamic debug specific tracepoint), since there seems > > to be a a use-case for that and it keeps things simpler. > > > > But I went back to Sean's original patch, and it creates a drm specific > > trace buffer "drm" (via trace_array_get_by_name("drm")). Here: > > https://patchwork.freedesktop.org/patch/445549/?series=78133&rev=5 > > > > So I think that may be some of the confusion here? The current thread/ > > proposal is not for a drm specific trace buffer... > > Hi Jason, > > I may very well have confused things, sorry about that. If this series > is not superseding the idea of the DRM flight recorder, then don't mind > me. It just sounded very similar and I also haven't seen new revisions > of the flight recorder in a long time. IMO this series has clarified the requirement for a flight-recorder mode, which seems to fit ideally in a separate instance. > > Having a subsystem specific trace buffer would allow subsystem specific > > trace log permissions depending on the sensitivity of the data. But > > doesn't drm output today go to the system log which is typically world > > readable today? > > Yes, and that is exactly the problem. The DRM debug output is so high > traffic it would make the system log both unusable due to cruft and > slow down the whole machine. The debug output is only useful when > something went wrong, and at that point it is too late to enable > debugging. That's why a flight recorder with an over-written circular > in-memory buffer is needed. Seans patch reuses enum drm_debug_category to split the tracing stream into 10 sub-streams - how much traffic from each ? - are some sub-streams more valuable for post-mortem ? - any value from further refinement of categories ? - drop irrelevant callsites individually to reduce clutter, extend buffer time/space ?