(+cc René as another possible reviewer) On Wed, Jan 05, 2022 at 01:23:24PM +0000, Jessica Clarke wrote: > Currently git_qsort_s allocates a buffer on the stack that has no > alignment, and mem_pool_alloc assumes uintmax_t's size is adequate > alignment for any type. > > On CHERI, and thus Arm's Morello prototype, pointers are implemented as > hardware capabilities which, as well as having a normal integer address, > have additional bounds, permissions and other metadata in a second word, > so on a 64-bit architecture they are 128-bit quantities, including their > alignment requirements. Despite being 128-bit, their integer component > is still only a 64-bit field, so uintmax_t remains 64-bit, and therefore > uintmax_t does not sufficiently align an allocation. > > Moreover, these capabilities have an additional "129th" tag bit, which > tracks the validity of the capability and is cleared on any invalid > operation that doesn't trap (e.g. partially overwriting a capability > will invalidate it) which, combined with the architecture's strict > checks on capability manipulation instructions, ensures it is > architecturally impossible to construct a capability that gives more > rights than those you were given in the first place. To store these tag > bits, each capability sized and aligned word in memory gains a single > tag bit that is stored in unaddressable (to the processor) memory. This > means that it is impossible to store a capability at an unaligned > address: a normal load or store of a capability will always take an > alignment fault even if the (micro)architecture supports unaligned > loads/stores for other data types, and a memcpy will, if the destination > is not appropriately aligned, copy the byte representation but lose the > tag, meaning that if it is eventually copied back and loaded from an > aligned location any attempt to dereference it will trap with a tag > fault. Thus, even char buffers that are memcpy'ed to or from must be > properly aligned on CHERI architectures if they are to hold pointers. > > Address both of these by introducing a new git_max_align type put in a > union with the on-stack buffer to force its alignment, as well as a new > GIT_MAX_ALIGNMENT macro whose value is the alignment of git_max_align > that gets used for mem_pool_alloc. As well as making the code work on > CHERI, the former change likely also improves performance on some > architectures by making memcpy faster (either because it can use larger > block sizes or because the microarchitecture has inefficient unaligned > accesses). > > Signed-off-by: Jessica Clarke <jrtc27@xxxxxxxxxx> > --- > compat/qsort_s.c | 11 +++++++---- > git-compat-util.h | 11 +++++++++++ > mem-pool.c | 6 +++--- > 3 files changed, 21 insertions(+), 7 deletions(-) > > diff --git a/compat/qsort_s.c b/compat/qsort_s.c > index 52d1f0a73d..1ccdb87451 100644 > --- a/compat/qsort_s.c > +++ b/compat/qsort_s.c > @@ -49,16 +49,19 @@ int git_qsort_s(void *b, size_t n, size_t s, > int (*cmp)(const void *, const void *, void *), void *ctx) > { > const size_t size = st_mult(n, s); > - char buf[1024]; > + union { > + char buf[1024]; > + git_max_align align; > + } u; I'm not sure I understand. Clearly this union aligns buf along the width of git_max_align. But what about the preimage makes buf unaligned? > diff --git a/git-compat-util.h b/git-compat-util.h > index 5fa54a7afe..28581a45c5 100644 > --- a/git-compat-util.h > +++ b/git-compat-util.h > @@ -274,6 +274,17 @@ typedef unsigned long uintptr_t; > #define _ALL_SOURCE 1 > #endif > > +typedef union { > + uintmax_t max_align_uintmax; > + void *max_align_pointer; > +} git_max_align; OK, the purpose of this union is to be as wide as the least common alignment between uintmax_t and void *, yes? > + > +typedef struct { > + char unalign; > + git_max_align aligned; > +} git_max_alignment; > +#define GIT_MAX_ALIGNMENT offsetof(git_max_alignment, aligned) ...then the offset of the aligned field within the git_max_alignment struct is going to be that common alignment? Could you not `#define GIT_MAX_ALIGNMENT` to be `sizeof(git_max_align)` directly, or is there something I'm missing? I suppose the way you wrote it here is done in order to prevent padding on the end of the git_max_align union from artificially increasing the value of GIT_MAX_ALIGNMENT. In any case, I *think* what you wrote here is right. The typedef's are uncommon to our codebase, though. I wonder how much of this is all necessary. Thanks, Taylor