Re: [PATCH v4 3/3] media: atomisp: Only use trace_printk if allowed

Nicolas Boichat <drinkcat@xxxxxxxxxxxx> · Fri, 21 Aug 2020 10:39:02 +0800

On Fri, Aug 21, 2020 at 9:57 AM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> On Fri, 21 Aug 2020 09:39:19 +0800
> Nicolas Boichat <drinkcat@xxxxxxxxxxxx> wrote:
>
> > On Fri, Aug 21, 2020 at 8:36 AM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> > >
> > > On Fri, 21 Aug 2020 08:13:00 +0800
> > > Nicolas Boichat <drinkcat@xxxxxxxxxxxx> wrote:
> > >
> > > > On Thu, Aug 20, 2020 at 10:23 PM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> > > > >
> > > > > On Thu, 20 Aug 2020 17:14:12 +0800
> > > > > Nicolas Boichat <drinkcat@xxxxxxxxxxxx> wrote:
> > > > >
> > > > > > Technically, we could only initialize the trace_printk buffers
> > > > > > when the print env is switched, to avoid the build error and
> > > > > > unconditional boot-time warning, but I assume this printing
> > > > > > framework will eventually get removed when the driver moves out
> > > > > > of staging?
> > > > >
> > > > > Perhaps this should be converting into a trace event. Look at what bpf
> > > > > did for their bpf_trace_printk().
> > > > >
> > > > > The more I think about it, the less I like this series.
> > > >
> > > > To make it clear, the primary goal of this series is to get rid of
> > > > trace_printk sprinkled in the kernel by making sure some randconfig
> > > > builds fail. Since my v2, there already has been one more added (the
> > > > one that this patch removes), so I'd like to land 2/3 ASAP to prevent
> > > > even more from being added.
> > > >
> > > > Looking at your reply on 1/3, I think we are aligned on that goal? Is
> > > > there some other approach you'd recommend?
> > > >
> > > > Now, I'm not pretending my fixes are the best possible ones, but I
> > > > would much rather have the burden of converting to trace events on the
> > > > respective driver maintainers. (btw is there a short
> > > > documentation/tutorial that I could link to in these patches, to help
> > > > developers understand what is the recommended way now?)
> > > >
> > >
> > > I like the goal, but I guess I never articulated the problem I have
> > > with the methodology.
> > >
> > > trace_printk() is meant to be a debugging tool. Something that people
> > > can and do sprinkle all over the kernel to help them find a bug in
> > > areas that are called quite often (where printk() is way too slow).
> > >
> > > The last thing I want them to deal with is adding a trace_printk() with
> > > their distro's config (or a config from someone that triggered the bug)
> > > only to have the build to fail, because they also need to add a config
> > > value.
> > >
> > > I add to the Cc a few developers I know that use trace_printk() in this
> > > fashion. I'd like to hear their view on having to add a config option
> > > to make trace_printk work before they test a config that is sent to
> > > them.
> >
> > Gotcha, thanks. I have also used trace_printk in the past, as
> > uncommitted changes (and understand the usefulness ,-)). And in Chrome
> > OS team here, developers have also raised this concern: how do we make
> > the developer flow convenient so that we can add trace_printk to our
> > code for debugging, without having to flip back that config option,
> > and _yet_ make sure that no trace_printk ever makes it into our
> > production kernels. We have creative ways of making that work (portage
> > USE flags and stuff). But I'm not sure about other flows, and your
> > concern is totally valid...
> >
> > Some other approaches/ideas:
> >  1. Filter all lkml messages that contain trace_printk. Already found
> > 1 instance, and I can easily reply to those with a semi-canned answer,
> > if I remember to check that filter regularly (not sustainable in the
> > long run...).
>
> Added Joe Perches to the thread.
>
> We can update checkpatch.pl to complain about a trace_printk() that it
> finds in the added code.

Oh, that's a good and simple idea.

>
> >  2. Integration into some kernel test robot? (I will not roll my own
> > for this ,-)) It may be a bit difficult as some debug config options
> > do enable trace_printk, and that's ok.
> >  3. In Chromium OS, I can add a unit test (i.e. something outside of
> > the normal kernel build system), but that'll only catch regressions
> > downstream (or when we happen to backport patches).
> >
> > Down the line, #3 catches what I care about the most (Chromium OS
> > issues: we had production kernels for a few days/weeks showing that
> > splat on boot), but it'd be nice to have something upstream that
> > benefits everyone.
> >
>
> What about an opposite config. That is, not have a config to enable it.
> But one to disable it. If it is disabled and a trace_printk is found,
> it will fail the build. This way your builds will not allow your kernel
> to get out the door with one.
>
> #ifdef CONFIG_DISABLE_TRACE_PRINTK
> #define trace_printk    __this_function_is_disabled
> #endif

I'm not sure how that helps? I mean, the use case you have in mind is
somebody reusing a distro/random config and not being able to use
trace_printk, right? If that config has CONFIG_DISABLE_TRACE_PRINTK=y,
then the developer will still need to flip that back.

Note that the option I'm added has default=y (_allow_ trace_printk),
so I don't think default y or default n really matters?

>
> ?
>
> -- Steve