Powered by Linux
Re: [RFC PATCH 5/7] kernel_user_data: track parameter __untagged annotations — Semantic Matching Tool

Re: [RFC PATCH 5/7] kernel_user_data: track parameter __untagged annotations

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Oct 07, 2019 at 04:35:43PM +0100, Andrew Murray wrote:
> The kernel provides a number of annotations that can be used by smatch
> for static analysis, for example __user and __kernel - the kernel relies
> on the GCC attribute 'address_space(x)' to implement these annotations.
> By adding a new annotation named __untagged with address space 5 we can
> use this annotation to indicate function parameters which should not
> be treated as tagged addresses.
> 
> Let's give Smatch an understanding of this annotation, when used by
> function parameters, by adding a '[u]' tag to the end of their user value
> ranges. We ensure that upon merges of range lists states, we only preserve
> the tag if both states contain the tag.
> 
> In other words, if two functions A and B both call function C (let's assume
> these functions all have a single parameter) - then the range list for
> function C's parameter will only contain the tag if the parameters for both
> A and B are also tagged. Therefore we know that if if we make a comparison
> between a tagged and untagged address in C then we know it doesn't need
> to flag a detection as the values concerned have all come from functions
> that are marked as __untagged.
> 
> If only function A was tagged, then we could raise a detection in C - and
> rely on parameter tracing to walk up both callers of C to build call trees
> and exclude anything and its parents that are tagged.
> 
> In practice we will always raise a detection and let the smdb.py script
> figure out which call trees to display.

This approach works well, however it sometimes falls over due to gaps in
the Smatch flow-tracking/parameter-tracking. E.g.

void func1(unsigned long __untagged addr, unsigned long len)
{
    unsigned long tmp = 1;
    tmp += addr;
    tmp = func2(tmp, len, 4);
    find_vma(NULL, tmp);
}

The parameter addr is annotated as untagged, therefore its range list may
look something like '0-u64max[u]'.

The first line of code results in a range list of tmp to be '1';

The second line of code assigns addr to tmp, resulting in tmp's range list
changing to '1-u64max[u]'

The third line of code passes tmp to a function and assigns the result back
to tmp. Just like any other function Smatch keeps track of all the callers
to func2, and when it parses the code in func2 it will consider 'tmp' to
have a range list of all those callers combined - if there is a caller of
func2 anywhere in the kernel that doesn't provide tmp with the '[u]' tag
then the tag will be dropped. This results in the return value of func2 not
containing a tag and resulting in 'tmp' inside func1 loosing it's '[u]' tag.
Sometimes Smatch is clever and can describe the range of values returned by a
function in terms relative to the input arguments, in that case the tag would
be preserved.

This means that find_vma, if it detects a tagged pointer will flag it seeing as
'tmp' does not contain the '[u]' tag. Fortunately, in this case we rely on
smdb.py to walk the tree from the detection, therefore it will see that func1
calls find_vma, and func1 'addr' is tagged and thus it can suppress the
detection. Unfortunately this relies on the ability of Smatch to detect that
'tmp' as passed to find_vma is the 'addr' in the function arguments - this
parameter tracking doesn't always work (for the same reasons as described). It is
for these reasons that using __untagged annotations is hit or miss.

Thanks,

Andrew Murray

> 
> Signed-off-by: Andrew Murray <andrew.murray@xxxxxxx>
> ---
>  smatch_estate.c           | 22 ++++++++++++
>  smatch_extra.h            |  3 ++
>  smatch_kernel_user_data.c | 71 +++++++++++++++++++++++++++++++++------
>  3 files changed, 86 insertions(+), 10 deletions(-)
> 
> diff --git a/smatch_estate.c b/smatch_estate.c
> index 5d0a1397acef..5650af88de21 100644
> --- a/smatch_estate.c
> +++ b/smatch_estate.c
> @@ -53,6 +53,9 @@ struct smatch_state *merge_estates(struct smatch_state *s1, struct smatch_state
>  	if (estate_capped(s1) && estate_capped(s2))
>  		estate_set_capped(tmp);
>  
> +	if (estate_treat_untagged(s1) && estate_treat_untagged(s2))
> +		estate_set_treat_untagged(tmp);
> +
>  	return tmp;
>  }
>  
> @@ -154,6 +157,23 @@ void estate_set_capped(struct smatch_state *state)
>  	get_dinfo(state)->capped = true;
>  }
>  
> +bool estate_treat_untagged(struct smatch_state *state)
> +{
> +	if (!state)
> +		return false;
> +
> +	/* impossible states are capped */
> +	if (!estate_rl(state))
> +		return true;
> +
> +	return get_dinfo(state)->treat_untagged;
> +}
> +
> +void estate_set_treat_untagged(struct smatch_state *state)
> +{
> +	get_dinfo(state)->treat_untagged = true;
> +}
> +
>  sval_t estate_min(struct smatch_state *state)
>  {
>  	return rl_min(estate_rl(state));
> @@ -204,6 +224,8 @@ int estates_equiv(struct smatch_state *one, struct smatch_state *two)
>  		return 0;
>  	if (estate_capped(one) != estate_capped(two))
>  		return 0;
> +	if (estate_treat_untagged(one) != estate_treat_untagged(two))
> +		return 0;
>  	if (strcmp(one->name, two->name) == 0)
>  		return 1;
>  	return 0;
> diff --git a/smatch_extra.h b/smatch_extra.h
> index 49cc7c72ad19..40d59baf253d 100644
> --- a/smatch_extra.h
> +++ b/smatch_extra.h
> @@ -31,6 +31,7 @@ struct data_info {
>  	sval_t fuzzy_max;
>  	unsigned int hard_max:1;
>  	unsigned int capped:1;
> +	unsigned int treat_untagged:1;
>  };
>  DECLARE_ALLOCATOR(data_info);
>  
> @@ -148,6 +149,8 @@ void estate_clear_hard_max(struct smatch_state *state);
>  int estate_get_hard_max(struct smatch_state *state, sval_t *sval);
>  bool estate_capped(struct smatch_state *state);
>  void estate_set_capped(struct smatch_state *state);
> +bool estate_treat_untagged(struct smatch_state *state);
> +void estate_set_treat_untagged(struct smatch_state *state);
>  
>  int estate_get_single_value(struct smatch_state *state, sval_t *sval);
>  struct smatch_state *get_implied_estate(struct expression *expr);
> diff --git a/smatch_kernel_user_data.c b/smatch_kernel_user_data.c
> index 9e8b6a7f3506..52ed875f00bf 100644
> --- a/smatch_kernel_user_data.c
> +++ b/smatch_kernel_user_data.c
> @@ -100,6 +100,8 @@ static void pre_merge_hook(struct sm_state *sm)
>  	state = alloc_estate_rl(clone_rl(rl));
>  	if (estate_capped(user) || is_capped_var_sym(sm->name, sm->sym))
>  		estate_set_capped(state);
> +	if (estate_treat_untagged(user))
> +		estate_set_treat_untagged(state);
>  	set_state(my_id, sm->name, sm->sym, state);
>  }
>  
> @@ -117,6 +119,8 @@ static void extra_nomod_hook(const char *name, struct symbol *sym, struct expres
>  	new = alloc_estate_rl(rl);
>  	if (estate_capped(user))
>  		estate_set_capped(new);
> +	if (estate_treat_untagged(user))
> +		estate_set_treat_untagged(new);
>  	set_state(my_id, name, sym, new);
>  }
>  
> @@ -174,6 +178,30 @@ bool user_rl_capped(struct expression *expr)
>  	return true;  /* not actually user data */
>  }
>  
> +bool user_rl_treat_untagged(struct expression *expr)
> +{
> +	struct smatch_state *state;
> +	struct range_list *rl;
> +	sval_t sval;
> +
> +	char *var_name = expr_to_var(expr);
> +
> +	expr = strip_expr(expr);
> +	if (!expr)
> +		return false;
> +	if (get_value(expr, &sval))
> +		return true;
> +
> +	state = get_state_expr(my_id, expr);
> +	if (state)
> +		return estate_treat_untagged(state);
> +
> +	if (get_user_rl(expr, &rl))
> +		return false;  /* uncapped user data */
> +
> +	return true;  /* not actually user data */
> +}
> +
>  static void tag_inner_struct_members(struct expression *expr, struct symbol *member)
>  {
>  	struct expression *edge_member;
> @@ -575,6 +603,8 @@ static bool handle_op_assign(struct expression *expr)
>  		state = alloc_estate_rl(rl);
>  		if (user_rl_capped(binop_expr))
>  			estate_set_capped(state);
> +		if (user_rl_treat_untagged(expr->left))
> +			estate_set_treat_untagged(state);
>  		set_state_expr(my_id, expr->left, state);
>  		return true;
>  	}
> @@ -616,8 +646,11 @@ static void match_assign(struct expression *expr)
>  
>  	rl = cast_rl(get_type(expr->left), rl);
>  	state = alloc_estate_rl(rl);
> +
>  	if (user_rl_capped(expr->right))
>  		estate_set_capped(state);
> +	if (user_rl_treat_untagged(expr->right))
> +		estate_set_treat_untagged(state);
>  	set_state_expr(my_id, expr->left, state);
>  
>  	return;
> @@ -1008,8 +1041,10 @@ static char *get_user_rl_str(struct expression *expr, struct symbol *type)
>  	if (!get_user_rl(expr, &rl))
>  		return NULL;
>  	rl = cast_rl(type, rl);
> -	snprintf(buf, sizeof(buf), "%s%s",
> -		 show_rl(rl), user_rl_capped(expr) ? "[c]" : "");
> +	snprintf(buf, sizeof(buf), "%s%s%s",
> +		 show_rl(rl),
> +		 user_rl_capped(expr) ? "[c]" : "",
> +		 user_rl_treat_untagged(expr) ? "[u]" : "");
>  	return buf;
>  }
>  
> @@ -1081,8 +1116,9 @@ static void struct_member_callback(struct expression *call, int param, char *pri
>  	if (!rl)
>  		return;
>  
> -	snprintf(buf, sizeof(buf), "%s%s", show_rl(rl),
> -		 estate_capped(sm->state) ? "[c]" : "");
> +	snprintf(buf, sizeof(buf), "%s%s%s", show_rl(rl),
> +		 estate_capped(sm->state) ? "[c]" : "",
> +		 estate_treat_untagged(sm->state) ? "[u]" : "");
>  	sql_insert_caller_info(call, USER_DATA, param, printed_name, buf);
>  }
>  
> @@ -1121,6 +1157,13 @@ static bool param_data_capped(const char *value)
>  	return false;
>  }
>  
> +static bool param_data_treat_untagged(const char *value)
> +{
> +	if (strstr(value, ",u") || strstr(value, "[u"))
> +		return true;
> +	return false;
> +}
> +
>  static void set_param_user_data(const char *name, struct symbol *sym, char *key, char *value)
>  {
>  	struct range_list *rl = NULL;
> @@ -1168,6 +1211,8 @@ static void set_param_user_data(const char *name, struct symbol *sym, char *key,
>  	state = alloc_estate_rl(rl);
>  	if (param_data_capped(value) || is_capped(expr))
>  		estate_set_capped(state);
> +	if (param_data_treat_untagged(value) || sym->ctype.as == 5)
> +		estate_set_treat_untagged(state);
>  	set_state(my_id, fullname, sym, state);
>  }
>  
> @@ -1240,6 +1285,8 @@ static void set_to_user_data(struct expression *expr, char *key, char *value)
>  	state = alloc_estate_rl(rl);
>  	if (param_data_capped(value))
>  		estate_set_capped(state);
> +	if (param_data_treat_untagged(value))
> +		estate_set_treat_untagged(state);
>  	set_state(my_id, name, sym, state);
>  free:
>  	free_string(name);
> @@ -1347,9 +1394,10 @@ static void param_set_to_user_data(int return_id, char *return_ranges, struct ex
>  		if (strcmp(param_name, "$") == 0)  /* The -1 param is handled after the loop */
>  			continue;
>  
> -		snprintf(buf, sizeof(buf), "%s%s",
> +		snprintf(buf, sizeof(buf), "%s%s%s",
>  			 show_rl(estate_rl(sm->state)),
> -			 estate_capped(sm->state) ? "[c]" : "");
> +			 estate_capped(sm->state) ? "[c]" : "",
> +			 estate_treat_untagged(sm->state) ? "[u]" : "");
>  		sql_insert_return_states(return_id, return_ranges,
>  					 func_gets_user_data ? USER_DATA_SET : USER_DATA,
>  					 param, param_name, buf);
> @@ -1369,9 +1417,10 @@ static void param_set_to_user_data(int return_id, char *return_ranges, struct ex
>  			return_found = true;
>  		if (strcmp(param_name, "*$") == 0)
>  			pointed_at_found = true;
> -		snprintf(buf, sizeof(buf), "%s%s",
> +		snprintf(buf, sizeof(buf), "%s%s%s",
>  			 show_rl(estate_rl(sm->state)),
> -			 estate_capped(sm->state) ? "[c]" : "");
> +			 estate_capped(sm->state) ? "[c]" : "",
> +			 estate_treat_untagged(sm->state) ? "[u]" : "");
>  		sql_insert_return_states(return_id, return_ranges,
>  					 func_gets_user_data ? USER_DATA_SET : USER_DATA,
>  					 -1, param_name, buf);
> @@ -1379,8 +1428,10 @@ static void param_set_to_user_data(int return_id, char *return_ranges, struct ex
>  
>  	/* This if for "return ntohl(foo);" */
>  	if (!return_found && get_user_rl(expr, &rl)) {
> -		snprintf(buf, sizeof(buf), "%s%s",
> -			 show_rl(rl), user_rl_capped(expr) ? "[c]" : "");
> +		snprintf(buf, sizeof(buf), "%s%s%s",
> +			 show_rl(rl),
> +			 user_rl_capped(expr) ? "[c]" : "",
> +			 user_rl_treat_untagged(expr) ? "[u]" : "");
>  		sql_insert_return_states(return_id, return_ranges,
>  					 func_gets_user_data ? USER_DATA_SET : USER_DATA,
>  					 -1, "$", buf);
> -- 
> 2.21.0
> 



[Index of Archives]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]     [Big List of Linux Books]

  Powered by Linux