Re: [PATCH 2/6] tracing: Add trace event error log

Masami Hiramatsu <mhiramat@xxxxxxxxxx> · Sat, 14 Apr 2018 15:31:03 +0900

On Thu, 12 Apr 2018 18:52:13 -0500
Tom Zanussi <tom.zanussi@xxxxxxxxxxxxxxx> wrote:

> Hi Steve,
> 
> On Thu, 2018-04-12 at 18:20 -0400, Steven Rostedt wrote:
> > On Thu, 12 Apr 2018 10:13:17 -0500
> > Tom Zanussi <tom.zanussi@xxxxxxxxxxxxxxx> wrote:
> > 
> > > diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
> > > index 6fb46a0..f2dc7e6 100644
> > > --- a/kernel/trace/trace.h
> > > +++ b/kernel/trace/trace.h
> > > @@ -1765,6 +1765,9 @@ extern ssize_t trace_parse_run_command(struct file *file,
> > >  		const char __user *buffer, size_t count, loff_t *ppos,
> > >  		int (*createfn)(int, char**));
> > >  
> > > +extern void event_log_err(const char *loc, const char *cmd, const char *fmt,
> > > +			  ...);
> > > +
> > >  /*
> > >   * Normal trace_printk() and friends allocates special buffers
> > >   * to do the manipulation, as well as saves the print formats
> > > diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
> > > index 05c7172..fd02e22 100644
> > > --- a/kernel/trace/trace_events.c
> > > +++ b/kernel/trace/trace_events.c
> > > @@ -1668,6 +1668,164 @@ static void ignore_task_cpu(void *data)
> > >  	return ret;
> > >  }
> > >  
> > > +#define EVENT_LOG_ERRS_MAX	(PAGE_SIZE / sizeof(struct event_log_err))
> > 
> > > +#define EVENT_ERR_LOG_MASK	(EVENT_LOG_ERRS_MAX - 1)
> > 
> > BTW, the above only works if EVENT_LOG_ERRS_MAX is a power of two,
> > which it's not guaranteed to be.
> > 
> 
> My assumption was that we'd only ever need a page or two for the
> error_log and so would always would be a power of two, since the size of
> the struct event_log_err is 512.
> 
> Anyway, I should probably have put comments about all this in the code,
> and I will, but the way it works kind of assumes a very small number of
> errors - it's replacing a simple 'last error' facility for the hist
> triggers and making it a common facility for other things that have
> similar needs like Masami's kprobe_events errors.  For those purposes, I
> assumed it would suffice to simply be able to show that last 8 or some
> similar small number of errors and constantly recycle the slots.

Correct. I don't think the error log entry size over 16, it is too much
errors occur. User must check it before that. Or, we should push it
printk buffer.

> 
> Basically it just splits the page into 16 strings, 2 per error, one for
> the actual error text, the other for the command the user entered.  The
> struct event_log_err just overlays a struct on top of 2 strings just to
> make it easier to manage.
> 
> Anyway, because it is such a small number, and we start with a zeroed
> page, whenever we print the error log, we print all 16 strings even if
> we only have one error (2 strings).  The rest are NULL and print
> nothing.  We start with the tail, which could also be thought of as the
> 'oldest' or the 'first' error in the buffer and just cycle through them
> all.  Hope that clears up some of the other questions you had about how
> a non-full log gets printed, etc... 
> 
> > > +
> > > +struct event_log_err {
> > > +	char			err[MAX_FILTER_STR_VAL];
> > > +	char			cmd[MAX_FILTER_STR_VAL];
> > > +};
> > 
> > I like the event_log_err idea, but the above can be shrunk to:
> > 
> > struct err_info {
> > 	u8	type; /* I can only imagine 254 types */
> > 	u8	pos;  /* MAX_FILTER_STR_VAR = 256 */
> > };
> > 
> > struct event_log_err {
> > 	struct err_info		info;
> > 	char			cmd[MAX_FILTER_STR_VAL];
> > };
> > 
> > There's no reason to put in a bunch of text that's going to be static
> > anyway. Have a lookup table like we do for filters.
> > 
> > +				log_err("Variable name not unique, need to use fully qualified name (%s) for variable: ", fqvar(system, event_name, var_name, true));
> > 
> 
> Hmm, most of the log_errs use printf strings that get expanded, so need
> a destination buffer, the event_log_err->err string, but I think I see
> what you're getting at - that we can get rid of the format strings
> altogether and make them static strings if we use the method of simply
> printing the static string and putting a caret where the error is as
> below.
> 
> > 
> > Instead of making the fqvar, find the location of the variable, and add:
> > 
> >  blah blah $var blah blah
> >             ^
> >   Variable name not unique, need to use fully qualified name for variable
> > 
> 
> OK, if we can do this for every error type, then we can use the lookup
> table and the caret position instead of format strings.  Which means
> getting rid of the simple ring of strings, but whatever..

>From the viewpoint of kprobe events, I'm OK for either way.
I'll rewrite parser to get parsing position correctly.

Thanks!

-- 
Masami Hiramatsu <mhiramat@xxxxxxxxxx>
--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html