On Mon, Oct 29, 2018 at 03:39:51PM +0000, Ben Dooks wrote: > The variadic argumnet code did not check any of the variadic arguments > as it did not previously know the possible type. Now we have the possible > formatting information stored in the ctype, we can do some checks on the > printf formatting types. > > Signed-off-by: Ben Dooks <ben.dooks@xxxxxxxxxxxxxxx> > --- > Fixes since v1: > - Split out the format-string -> symbol code > - Use symbol_list for the symbols from format parsing > - Changed to follow the new parsing code and ctype use > - Merged the unsigned-int/long types together > > Notes: > - Is there a way of better doing the vararg list? I'm not totally sure to understand but retireving a sym from the variadic_types can/should be: * done with an helper giving the nth element from a list (I think I've such an helper somewhere in an unposted patch series, I'll look after it). * more simply done with an iteration: PREPARE_PTR_LIST(variadic_types, vtype); ... if (i >= fn->ctype.printf_va_start) { target = vtype; NEXT_PTR_LIST(vtype); } > - %p still generates an address-space mismatch Yes. See my comment in patch 5/5. I really thing you will need to either special-case it or more probably store into 'result' some checking function instead of simply storing a type (see nex point). > - how do we deal with the kernel's attempt to make printk format all types? In evaluate_printf_symbol(), instead of having a simple switch testing 'the' conversion character, I think it will be easier and much more flexible to use a table which will maps the conversion string (here only 1-char string but ...) into a function pointer that will make the check. The string at entry will allow to use more complex conversion format (like kernel's "%pI4b", for example). The checking function will allow to make real type check. The problem is that as it is now, the checking is done by compatible_argument_type() by this function is much more permissive than you wish (for example, a 'long' is compatible with an 'int' and thus won't give you any warning although they have a different size (on 64 bit) which is the most important thing to check in the variadic checks). Same for "%p" and noderef, like you have discovered. > --- > evaluate.c | 165 +++++++++++++++++++++++++++++++++++++++++++++++++++-- > parse.c | 2 +- > 2 files changed, 162 insertions(+), 5 deletions(-) > > diff --git a/evaluate.c b/evaluate.c > index b96696d..2a98a9b 100644 > --- a/evaluate.c > +++ b/evaluate.c > @@ -2243,23 +2243,180 @@ static struct symbol *evaluate_alignof(struct expression *expr) > return size_t_ctype; > } > > +struct printf_state { > + int len; /* size of the argument (ie, %d, %ld, %lld, etc.) */ > +}; > + > +static struct symbol *evaluate_printf_symbol(const char *string, struct printf_state *state) In sparse 'evaluation' is really 'type evaluation'. I would prefer that you name this function something like 'parse_printf_conversion/format()'. > +static int decompose_format_printf(const char *string, struct symbol_list **result) > +{ > + struct printf_state state; > + int count = 0; > + > + /* TODO - deal with explitic position arguments */ > + for (; string[0] != '\0'; string++) { > + struct symbol *sym; > + > + if (string[0] != '%') > + continue; > + if (string[1] == '%') { > + string++; > + continue; > + } > + > + state.len = 0; > + > + /* get rid of any formatting width bits */ > + while (isdigit(string[1]) || string[1] == '+' || string[1] == '-') > + string++; > + > + sym = evaluate_printf_symbol(string+1, &state); > + if (sym) { > + add_symbol(result, sym); > + count++; > + } > + > + while (string[0] != ' ' && string[0] != '\0') > + string++; > + > + string--; > + } Unless you have some further plan for struct printf_state, the variable 'state' is really something local to evaluate_printf_symbol() and should be moved tere. Likewise, 'sym' is not really neeed here as you can pass 'result' directly to evaluate_printf_symbol() and add the symbol there. And if you then let evaluate_printf_symbol() return the next position of the string you can simplify part of the string handling here. > @@ -2281,11 +2438,11 @@ static int evaluate_arguments(struct symbol *fn, struct expression_list *head) > sprintf(where, "argument %d", i); > compatible_argument_type(expr, target, p, where); > } > - > - i++; > NEXT_PTR_LIST(argtype); > + i++; > } END_FOR_EACH_PTR(expr); > FINISH_PTR_LIST(argtype); > + > return 1; > } This is not needed, no? > diff --git a/parse.c b/parse.c > index 9b0d40e..6b0a20b 100644 > --- a/parse.c > +++ b/parse.c > @@ -1078,7 +1078,7 @@ static struct token *attribute_format(struct token *token, struct symbol *attr, > fmt_sym = lookup_keyword(token->ident, NS_KEYWORD); > > if (!fmt_sym || !fmt_sym->op || > - fmt_sym->op != &attr_printf_op) { > + fmt_sym->op->type != KW_FORMAT) { > sparse_error(token->pos, > "unknown format type '%s'\n", > show_ident(token->ident)); This should be folded with the third patch. Kind regards, -- Luc