Re: [PATCH] apply: Avoid ambiguous pointer provenance for CHERI/Arm's Morello

On 7 Jan 2022, at 12:16, René Scharfe <l.s.r@xxxxxx> wrote:
> 
> Am 06.01.22 um 23:53 schrieb Junio C Hamano:
>> Jessica Clarke <jrtc27@xxxxxxxxxx> writes:
>> 
>>> On CHERI, and thus Arm's Morello prototype, pointers are implemented as
>>> hardware capabilities which, as well as having a normal integer address,
>>> have additional bounds, permissions and other metadata in a second word.
>>> In order to preserve this metadata, uintptr_t is also implemented as a
>>> capability, not a plain integer, which causes problems for binary
>>> operators, as the metadata preserved in the output can only come from
>>> one of the inputs. In most cases this is clear, as normally at least one
>>> operand is provably a plain integer, but if both operands are uintptr_t
>>> and have no indication they're just plain integers then it is ambiguous,
>>> and the current implementation will arbitrarily, but deterministically,
>>> pick the left-hand side, due to empirical evidence that it is more
>>> likely to be correct.
>> 
>> What's left-hand side in the context of the code you changed?
>> Between "what" vs "ent->util" you take "what"?  That cannot be
>> true.  Are you referring to the (usually hidden and useless when we
>> use it as an integer) "hardware capabilities" word as "left" vs the
>> value of the pointer as "right"?
>> 
>>> static uintptr_t register_symlink_changes(struct apply_state *state,
>>> 					  const char *path,
>>> -					  uintptr_t what)
>>> +					  size_t what)
>>> {
>>> 	struct string_list_item *ent;
>>> 
>>> @@ -3823,7 +3823,7 @@ static uintptr_t register_symlink_changes(struct apply_state *state,
>>> 		ent = string_list_insert(&state->symlink_changes, path);
>>> 		ent->util = (void *)0;
>>> 	}
>>> -	ent->util = (void *)(what | ((uintptr_t)ent->util));
>>> +	ent->util = (void *)((uintptr_t)what | ((uintptr_t)ent->util));
>>> 	return (uintptr_t)ent->util;
>>> }
>> 
>> I actually wonder if it results in code that is much easier to
>> follow if we did:
>> 
>> * Introduce an "enum apply_symlink_treatment" that has
>>   APPLY_SYMLINK_GOES_AWAY and APPLY_SYMLINK_IN_RESULT as its
>>   possible values;
>> 
>> * Make register_symlink_changes() and check_symlink_changes()
>>   work with "enum apply_symlink_treatment";
>> 
>> * (optional) stop using string_list() to store the symlink_changes;
>>   use strintmap and use strintmap_set() and strintmap_get() to
>>   access its entries, so that the ugly implementation detail
>>   (i.e. "the container type we use only has a (void *) field to
>>   store caller-supplied data, so we cast an integer and a pointer
>>   back and forth") can be safely hidden.
>> 
> Or strsets -- we only need two.
> 
> --- >8 ---
> Subject: [PATCH] apply: use strsets to track symlinks
> 
> Symlink changes are tracked in a string_list, with the util pointer
> value indicating whether a symlink is kept or removed.  Using fake
> pointer values requires awkward casts.  Use one strset for each type of
> change instead to simplify and shorten the code.
> 
> Original-patch-by: Jessica Clarke <jrtc27@xxxxxxxxxx>
> Signed-off-by: René Scharfe <l.s.r@xxxxxx>

Thanks, this patch makes sense to me. Incidentally, seeing the bigger
picture as a result of this patch touching everywhere that used that
list, I can see that in fact the existing code would have worked, just
with the compiler warning that something potentially iffy was going on.
I had assumed ent->util was still sometimes storing an actual pointer,
with the low bits being used as flags, as many things tend to do, but
in fact it was always NULL plus a couple of flag bits, so both sides of
the | always had the same bounds/permissions/tag, that of NULL (i.e.
tag cleared as invalid, full bounds). This still looks like a nice
cleanup though.
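
To make the point about the old code concrete, here is a minimal sketch
of the pattern, not the actual apply.c code. The sketch_* and SKETCH_*
names are made up for illustration, and struct sketch_item only stands
in for string_list_item. The slot never holds a real pointer, only NULL
plus flag bits, so both operands of the | are derived from NULL:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flags mirroring the removed APPLY_SYMLINK_* macros. */
#define SKETCH_GOES_AWAY 01
#define SKETCH_IN_RESULT 02

/* Stand-in for string_list_item; only the util slot matters here. */
struct sketch_item {
	void *util;
};

/*
 * Accumulate flag bits in a pointer-sized slot.  Taking "what" as a
 * size_t and casting it at the point of use makes it clear that this
 * operand is a plain integer, so only ent->util can supply capability
 * metadata on CHERI.
 */
static uintptr_t sketch_register(struct sketch_item *ent, size_t what)
{
	/* Only the two flag bits are valid input. */
	assert((what & ~(size_t)(SKETCH_GOES_AWAY | SKETCH_IN_RESULT)) == 0);

	ent->util = (void *)((uintptr_t)what | (uintptr_t)ent->util);
	return (uintptr_t)ent->util;
}
```

Starting from an item whose util is NULL, registering SKETCH_GOES_AWAY
and then SKETCH_IN_RESULT leaves both bits set in the slot.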

Jess

> ---
> apply.c | 42 ++++++++----------------------------------
> apply.h | 26 +++++++++++---------------
> 2 files changed, 19 insertions(+), 49 deletions(-)
> 
> diff --git a/apply.c b/apply.c
> index fed195250b..7deb4f79fd 100644
> --- a/apply.c
> +++ b/apply.c
> @@ -103,7 +103,8 @@ int init_apply_state(struct apply_state *state,
> 	state->linenr = 1;
> 	string_list_init_nodup(&state->fn_table);
> 	string_list_init_nodup(&state->limit_by_name);
> -	string_list_init_nodup(&state->symlink_changes);
> +	strset_init(&state->removed_symlinks);
> +	strset_init(&state->kept_symlinks);
> 	strbuf_init(&state->root, 0);
> 
> 	git_apply_config();
> @@ -117,7 +118,8 @@ int init_apply_state(struct apply_state *state,
> void clear_apply_state(struct apply_state *state)
> {
> 	string_list_clear(&state->limit_by_name, 0);
> -	string_list_clear(&state->symlink_changes, 0);
> +	strset_clear(&state->removed_symlinks);
> +	strset_clear(&state->kept_symlinks);
> 	strbuf_release(&state->root);
> 
> 	/* &state->fn_table is cleared at the end of apply_patch() */
> @@ -3812,59 +3814,31 @@ static int check_to_create(struct apply_state *state,
> 	return 0;
> }
> 
> -static uintptr_t register_symlink_changes(struct apply_state *state,
> -					  const char *path,
> -					  uintptr_t what)
> -{
> -	struct string_list_item *ent;
> -
> -	ent = string_list_lookup(&state->symlink_changes, path);
> -	if (!ent) {
> -		ent = string_list_insert(&state->symlink_changes, path);
> -		ent->util = (void *)0;
> -	}
> -	ent->util = (void *)(what | ((uintptr_t)ent->util));
> -	return (uintptr_t)ent->util;
> -}
> -
> -static uintptr_t check_symlink_changes(struct apply_state *state, const char *path)
> -{
> -	struct string_list_item *ent;
> -
> -	ent = string_list_lookup(&state->symlink_changes, path);
> -	if (!ent)
> -		return 0;
> -	return (uintptr_t)ent->util;
> -}
> -
> static void prepare_symlink_changes(struct apply_state *state, struct patch *patch)
> {
> 	for ( ; patch; patch = patch->next) {
> 		if ((patch->old_name && S_ISLNK(patch->old_mode)) &&
> 		    (patch->is_rename || patch->is_delete))
> 			/* the symlink at patch->old_name is removed */
> -			register_symlink_changes(state, patch->old_name, APPLY_SYMLINK_GOES_AWAY);
> +			strset_add(&state->removed_symlinks, patch->old_name);
> 
> 		if (patch->new_name && S_ISLNK(patch->new_mode))
> 			/* the symlink at patch->new_name is created or remains */
> -			register_symlink_changes(state, patch->new_name, APPLY_SYMLINK_IN_RESULT);
> +			strset_add(&state->kept_symlinks, patch->new_name);
> 	}
> }
> 
> static int path_is_beyond_symlink_1(struct apply_state *state, struct strbuf *name)
> {
> 	do {
> -		unsigned int change;
> -
> 		while (--name->len && name->buf[name->len] != '/')
> 			; /* scan backwards */
> 		if (!name->len)
> 			break;
> 		name->buf[name->len] = '\0';
> -		change = check_symlink_changes(state, name->buf);
> -		if (change & APPLY_SYMLINK_IN_RESULT)
> +		if (strset_contains(&state->kept_symlinks, name->buf))
> 			return 1;
> -		if (change & APPLY_SYMLINK_GOES_AWAY)
> +		if (strset_contains(&state->removed_symlinks, name->buf))
> 			/*
> 			 * This cannot be "return 0", because we may
> 			 * see a new one created at a higher level.
> diff --git a/apply.h b/apply.h
> index 16202da160..4052da50c0 100644
> --- a/apply.h
> +++ b/apply.h
> @@ -4,6 +4,7 @@
> #include "hash.h"
> #include "lockfile.h"
> #include "string-list.h"
> +#include "strmap.h"
> 
> struct repository;
> 
> @@ -25,20 +26,6 @@ enum apply_verbosity {
> 	verbosity_verbose = 1
> };
> 
> -/*
> - * We need to keep track of how symlinks in the preimage are
> - * manipulated by the patches.  A patch to add a/b/c where a/b
> - * is a symlink should not be allowed to affect the directory
> - * the symlink points at, but if the same patch removes a/b,
> - * it is perfectly fine, as the patch removes a/b to make room
> - * to create a directory a/b so that a/b/c can be created.
> - *
> - * See also "struct string_list symlink_changes" in "struct
> - * apply_state".
> - */
> -#define APPLY_SYMLINK_GOES_AWAY 01
> -#define APPLY_SYMLINK_IN_RESULT 02
> -
> struct apply_state {
> 	const char *prefix;
> 
> @@ -86,7 +73,16 @@ struct apply_state {
> 
> 	/* Various "current state" */
> 	int linenr; /* current line number */
> -	struct string_list symlink_changes; /* we have to track symlinks */
> +	/*
> +	 * We need to keep track of how symlinks in the preimage are
> +	 * manipulated by the patches.  A patch to add a/b/c where a/b
> +	 * is a symlink should not be allowed to affect the directory
> +	 * the symlink points at, but if the same patch removes a/b,
> +	 * it is perfectly fine, as the patch removes a/b to make room
> +	 * to create a directory a/b so that a/b/c can be created.
> +	 */
> +	struct strset removed_symlinks;
> +	struct strset kept_symlinks;
> 
> 	/*
> 	 * For "diff-stat" like behaviour, we keep track of the biggest change
> --
> 2.34.1
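
For readers who want to see the two-set lookup in isolation, below is a
rough, self-contained sketch of the backwards scan in
path_is_beyond_symlink_1(). It is not git code: sketch_contains() is a
hypothetical linear search standing in for strset_contains(), the sets
are plain NULL-terminated arrays, and the part of the loop that the diff
context cuts off is assumed to simply move on to the next higher
directory.

```c
#include <string.h>

/* Hypothetical stand-in for strset_contains(): linear search. */
static int sketch_contains(const char *const *set, const char *name)
{
	for (; *set; set++)
		if (!strcmp(*set, name))
			return 1;
	return 0;
}

/*
 * Return 1 if a leading directory of the path in buf is a symlink
 * that the patch set keeps or creates.  buf is truncated in place.
 */
static int sketch_beyond_symlink(const char *const *kept,
				 const char *const *removed,
				 char *buf)
{
	size_t len = strlen(buf);

	while (len) {
		while (--len && buf[len] != '/')
			; /* scan backwards to the previous '/' */
		if (!len)
			break;
		buf[len] = '\0';
		if (sketch_contains(kept, buf))
			return 1;
		if (sketch_contains(removed, buf))
			/* not "return 0": a higher level may still match */
			continue;
		/* assumption: the real code inspects the preimage here */
	}
	return 0;
}
```

With kept holding "a/b" and removed holding "x", the path "a/b/c" is
reported as beyond a symlink, while "x/y" and a bare "c" are not.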




