Re: [PATCH] apply: Avoid ambiguous pointer provenance for CHERI/Arm's Morello

On 6 Jan 2022, at 22:53, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> 
> Jessica Clarke <jrtc27@xxxxxxxxxx> writes:
> 
>> On CHERI, and thus Arm's Morello prototype, pointers are implemented as
>> hardware capabilities which, as well as having a normal integer address,
>> have additional bounds, permissions and other metadata in a second word.
>> In order to preserve this metadata, uintptr_t is also implemented as a
>> capability, not a plain integer, which causes problems for binary
>> operators, as the metadata preserved in the output can only come from
>> one of the inputs. In most cases this is clear, as normally at least one
>> operand is provably a plain integer, but if both operands are uintptr_t
>> and have no indication they're just plain integers then it is ambiguous,
>> and the current implementation will arbitrarily, but deterministically,
>> pick the left-hand side, due to empirical evidence that it is more
>> likely to be correct.
> 
> What's left-hand side in the context of the code you changed?
> Between "what" vs "ent->util" you take "what"?  That cannot be
> true.  Are you referring to the (usually hidden and useless when we
> use it as an integer) "hardware capabilities" word as "left" vs the
> value of the pointer as "right"?

The left-hand side is "what" and the right-hand side is
((uintptr_t)ent->util). The bounds/permissions/tag/etc. need to be
inherited from somewhere, and as an arbitrary, empirically best choice
when it is otherwise ambiguous we choose the left, i.e. "what". The
alternative would be to make this an error, which would result in
strictly more code failing to build for CHERI despite such a guess
being correct most of the time.
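
To make the ambiguity concrete, here is a minimal sketch of the pattern
outside of apply.c. The names "struct item", "add_flag", FLAG_A and
FLAG_B are hypothetical and only stand in for the string_list_item and
the symlink flags; the point is that a size_t parameter is provably a
plain integer, so the only operand that can supply capability metadata
is the one derived from the pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a container entry with a (void *) payload. */
struct item {
	const char *string;
	void *util;
};

/* Hypothetical flag values, for illustration only. */
#define FLAG_A ((size_t)1)
#define FLAG_B ((size_t)2)

/*
 * "flag" is a size_t, so it is known to be a plain integer with no
 * capability metadata. In the OR below, (uintptr_t)it->util is the only
 * operand derived from a pointer, so on CHERI there is a single
 * candidate for where the result's metadata comes from.
 *
 * If "flag" were declared as uintptr_t instead, both operands would be
 * possible sources and the compiler would have to pick one.
 */
static uintptr_t add_flag(struct item *it, size_t flag)
{
	assert(it != NULL);
	it->util = (void *)((uintptr_t)flag | (uintptr_t)it->util);
	return (uintptr_t)it->util;
}
```

On a conventional (non-CHERI) target both variants behave the same;
the difference only matters where uintptr_t carries a capability.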

>> static uintptr_t register_symlink_changes(struct apply_state *state,
>> 					  const char *path,
>> -					  uintptr_t what)
>> +					  size_t what)
>> {
>> 	struct string_list_item *ent;
>> 
>> @@ -3823,7 +3823,7 @@ static uintptr_t register_symlink_changes(struct apply_state *state,
>> 		ent = string_list_insert(&state->symlink_changes, path);
>> 		ent->util = (void *)0;
>> 	}
>> -	ent->util = (void *)(what | ((uintptr_t)ent->util));
>> +	ent->util = (void *)((uintptr_t)what | ((uintptr_t)ent->util));
>> 	return (uintptr_t)ent->util;
>> }
> 
> I actually wonder if it results in code that is much easier to
> follow if we did:
> 
> * Introduce an "enum apply_symlink_treatment" that has
>   APPLY_SYMLINK_GOES_AWAY and APPLY_SYMLINK_IN_RESULT as its
>   possible values;
> 
> * Make register_symlink_changes() and check_symlink_changes()
>   work with "enum apply_symlink_treatment";
> 
> * (optional) stop using string_list() to store the symlink_changes;
>   use strintmap and use strintmap_set() and strintmap_get() to
>   access its entries, so that the ugly implementation detail
>   (i.e. "the container type we use only has a (void *) field to
>   store caller-supplied data, so we cast an integer and a pointer
>   back and forth") can be safely hidden.

Those would be better if you want a less-minimal change. I can easily
do the first two, but the last one may or may not take me a while to
figure out given I’m not familiar with git’s C internals.
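
For the first two points, a rough sketch of what I understand the
suggestion to mean is below. This is not a patch against apply.c: the
enum names come from the suggestion above, but the bit values, "struct
symlink_entry" and both helper functions are my own assumptions for
illustration, and the entry stores the flags in a plain integer field
rather than in a (void *).

```c
#include <assert.h>
#include <stddef.h>

/* Names from the suggestion; the bit values are an assumption. */
enum apply_symlink_treatment {
	APPLY_SYMLINK_GOES_AWAY = (1 << 0),
	APPLY_SYMLINK_IN_RESULT = (1 << 1)
};

/* Hypothetical entry type that stores the flags as an integer. */
struct symlink_entry {
	const char *path;
	unsigned int treatment;
};

/* Record one more treatment for the entry; returns the combined flags. */
static unsigned int register_treatment(struct symlink_entry *ent,
				       enum apply_symlink_treatment what)
{
	assert(ent != NULL);
	ent->treatment |= (unsigned int)what;
	return ent->treatment;
}

/* Non-zero if the given treatment has been recorded for the entry. */
static int has_treatment(const struct symlink_entry *ent,
			 enum apply_symlink_treatment what)
{
	assert(ent != NULL);
	return (ent->treatment & (unsigned int)what) != 0;
}
```

With no pointer-to-integer casts left, the provenance question does not
arise at all, which is presumably the appeal of going further than the
minimal change.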

Jess




