On 6 Jan 2022, at 22:53, Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> Jessica Clarke <jrtc27@xxxxxxxxxx> writes:
>
>> On CHERI, and thus Arm's Morello prototype, pointers are implemented as
>> hardware capabilities which, as well as having a normal integer address,
>> have additional bounds, permissions and other metadata in a second word.
>> In order to preserve this metadata, uintptr_t is also implemented as a
>> capability, not a plain integer, which causes problems for binary
>> operators, as the metadata preserved in the output can only come from
>> one of the inputs. In most cases this is clear, as normally at least one
>> operand is provably a plain integer, but if both operands are uintptr_t
>> and have no indication they're just plain integers then it is ambiguous,
>> and the current implementation will arbitrarily, but deterministically,
>> pick the left-hand side, due to empirical evidence that it is more
>> likely to be correct.
>
> What's left-hand side in the context of the code you changed?
> Between "what" vs "ent->util" you take "what"? That cannot be
> true. Are you referring to the (usually hidden and useless when we
> use it as an integer) "hardware capabilities" word as "left" vs the
> value of the pointer as "right"?

Left-hand side is what, right-hand side is ((uintptr_t)ent->util). The
bounds/permissions/tag/etc need to be inherited from somewhere, and as
an arbitrary empirically-best choice when otherwise ambiguous we choose
the left, i.e. what. The alternative would just be to error, which will
result in strictly more code failing to build for CHERI despite such a
guess being correct most of the time.

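As a rough illustration of the two forms being discussed (a sketch only, not code from apply.c: struct demo_item, the DEMO_* flag values and both function names are made up for this example), the difference is purely in how the integer operand is declared. On a conventional platform both variants compute the same value; the distinction only matters to a CHERI compiler deciding where the capability metadata comes from.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits, standing in for the two symlink states. */
#define DEMO_GOES_AWAY 01
#define DEMO_IN_RESULT 02

/* Stand-in for a container entry that only offers a void * slot. */
struct demo_item {
	void *util;
};

/*
 * Ambiguous form: both operands of | are uintptr_t, so nothing in the
 * types says which one is "just an integer". Per the discussion above,
 * the CHERI implementation then takes the metadata from the left-hand
 * operand, i.e. "what".
 */
static uintptr_t demo_or_ambiguous(struct demo_item *ent, uintptr_t what)
{
	assert(ent != NULL);
	ent->util = (void *)(what | ((uintptr_t)ent->util));
	return (uintptr_t)ent->util;
}

/*
 * Form used in the patch: the parameter is declared as size_t, so it is
 * visibly a plain integer before it is converted for the OR.
 */
static uintptr_t demo_or_plain(struct demo_item *ent, size_t what)
{
	assert(ent != NULL);
	ent->util = (void *)((uintptr_t)what | ((uintptr_t)ent->util));
	return (uintptr_t)ent->util;
}
```

Calling either function first with DEMO_GOES_AWAY and then with DEMO_IN_RESULT on the same zero-initialised item accumulates both bits in util.
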
>>  static uintptr_t register_symlink_changes(struct apply_state *state,
>>                                            const char *path,
>> -                                          uintptr_t what)
>> +                                          size_t what)
>>  {
>>          struct string_list_item *ent;
>>
>> @@ -3823,7 +3823,7 @@ static uintptr_t register_symlink_changes(struct apply_state *state,
>>                  ent = string_list_insert(&state->symlink_changes, path);
>>                  ent->util = (void *)0;
>>          }
>> -        ent->util = (void *)(what | ((uintptr_t)ent->util));
>> +        ent->util = (void *)((uintptr_t)what | ((uintptr_t)ent->util));
>>          return (uintptr_t)ent->util;
>>  }
>
> I actually wonder if it results in code that is much easier to
> follow if we did:
>
> * Introduce an "enum apply_symlink_treatment" that has
>   APPLY_SYMLINK_GOES_AWAY and APPLY_SYMLINK_IN_RESULT as its
>   possible values;
>
> * Make register_symlink_changes() and check_symlink_changes()
>   work with "enum apply_symlink_treatment";
>
> * (optional) stop using string_list() to store the symlink_changes;
>   use strintmap and use strintmap_set() and strintmap_get() to
>   access its entries, so that the ugly implementation detail
>   (i.e. "the container type we use only has a (void *) field to
>   store caller-supplied data, so we cast an integer and a pointer
>   back and forth") can be safely hidden.

Those would be better if you want a less-minimal change. I can easily
do the first two, but the last one may or may not take me a while to
figure out given I’m not familiar with git’s C internals.

Jess
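
For reference, the shape of the suggested refactoring could look roughly like the sketch below. Only the enum name and its two constants come from the suggestion above. The numeric values, struct demo_symlink_map and every demo_* function are assumptions invented for this example, and the fixed-size table is merely a stand-in for a string-to-int map. It is not git's strintmap and does not attempt to reproduce that API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Names from the suggestion; the bit values are an assumption. */
enum apply_symlink_treatment {
	APPLY_SYMLINK_GOES_AWAY = (1 << 0),
	APPLY_SYMLINK_IN_RESULT = (1 << 1)
};

/* Hypothetical fixed-size path -> flags table (NOT git's strintmap). */
#define DEMO_MAX_PATHS 16

struct demo_symlink_map {
	const char *path[DEMO_MAX_PATHS];
	int flags[DEMO_MAX_PATHS];
	size_t nr;
};

/* Find the slot for path; optionally create it. NULL if absent or full. */
static int *demo_lookup(struct demo_symlink_map *map, const char *path,
			int create)
{
	size_t i;

	assert(map != NULL && path != NULL);
	for (i = 0; i < map->nr; i++)
		if (!strcmp(map->path[i], path))
			return &map->flags[i];
	if (!create || map->nr == DEMO_MAX_PATHS)
		return NULL;
	map->path[map->nr] = path; /* caller keeps the string alive */
	map->flags[map->nr] = 0;
	return &map->flags[map->nr++];
}

/* OR the new treatment into what is recorded; -1 if the table is full. */
static int demo_register_symlink_change(struct demo_symlink_map *map,
					const char *path,
					enum apply_symlink_treatment what)
{
	int *slot = demo_lookup(map, path, 1);

	if (!slot)
		return -1;
	*slot |= what;
	return *slot;
}

/* Return the recorded flags for path, or 0 if nothing was registered. */
static int demo_check_symlink_change(struct demo_symlink_map *map,
				     const char *path)
{
	int *slot = demo_lookup(map, path, 0);

	return slot ? *slot : 0;
}
```

Because the flags live in an int rather than being squeezed through a (void *) field, no integer/pointer casts remain, so the uintptr_t ambiguity does not arise in this variant.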