On 7 Jan 2022, at 16:08, René Scharfe <l.s.r@xxxxxx> wrote: > > Am 05.01.22 um 14:23 schrieb Jessica Clarke: >> Currently git_qsort_s allocates a buffer on the stack that has no >> alignment, and mem_pool_alloc assumes uintmax_t's size is adequate >> alignment for any type. >> >> On CHERI, and thus Arm's Morello prototype, pointers are implemented as >> hardware capabilities which, as well as having a normal integer address, >> have additional bounds, permissions and other metadata in a second word, >> so on a 64-bit architecture they are 128-bit quantities, including their >> alignment requirements. Despite being 128-bit, their integer component >> is still only a 64-bit field, so uintmax_t remains 64-bit, and therefore >> uintmax_t does not sufficiently align an allocation. >> >> Moreover, these capabilities have an additional "129th" tag bit, which >> tracks the validity of the capability and is cleared on any invalid >> operation that doesn't trap (e.g. partially overwriting a capability >> will invalidate it) which, combined with the architecture's strict >> checks on capability manipulation instructions, ensures it is >> architecturally impossible to construct a capability that gives more >> rights than those you were given in the first place. To store these tag >> bits, each capability sized and aligned word in memory gains a single >> tag bit that is stored in unaddressable (to the processor) memory. This >> means that it is impossible to store a capability at an unaligned >> address: a normal load or store of a capability will always take an >> alignment fault even if the (micro)architecture supports unaligned >> loads/stores for other data types, and a memcpy will, if the destination >> is not appropriately aligned, copy the byte representation but lose the >> tag, meaning that if it is eventually copied back and loaded from an >> aligned location any attempt to dereference it will trap with a tag >> fault. Thus, even char buffers that are memcpy'ed to or from must be >> properly aligned on CHERI architectures if they are to hold pointers. >> >> Address both of these by introducing a new git_max_align type put in a >> union with the on-stack buffer to force its alignment, as well as a new >> GIT_MAX_ALIGNMENT macro whose value is the alignment of git_max_align >> that gets used for mem_pool_alloc. As well as making the code work on >> CHERI, the former change likely also improves performance on some >> architectures by making memcpy faster (either because it can use larger >> block sizes or because the microarchitecture has inefficient unaligned >> accesses). >> >> Signed-off-by: Jessica Clarke <jrtc27@xxxxxxxxxx> >> --- >> compat/qsort_s.c | 11 +++++++---- >> git-compat-util.h | 11 +++++++++++ >> mem-pool.c | 6 +++--- >> 3 files changed, 21 insertions(+), 7 deletions(-) >> >> diff --git a/compat/qsort_s.c b/compat/qsort_s.c >> index 52d1f0a73d..1ccdb87451 100644 >> --- a/compat/qsort_s.c >> +++ b/compat/qsort_s.c >> @@ -49,16 +49,19 @@ int git_qsort_s(void *b, size_t n, size_t s, >> int (*cmp)(const void *, const void *, void *), void *ctx) >> { >> const size_t size = st_mult(n, s); >> - char buf[1024]; >> + union { >> + char buf[1024]; >> + git_max_align align; >> + } u; >> >> if (!n) >> return 0; >> if (!b || !cmp) >> return -1; >> >> - if (size < sizeof(buf)) { >> - /* The temporary array fits on the small on-stack buffer. */ >> - msort_with_tmp(b, n, s, cmp, buf, ctx); >> + if (size < sizeof(u.buf)) { >> + /* The temporary array fits in the small on-stack buffer. */ >> + msort_with_tmp(b, n, s, cmp, u.buf, ctx); > > So buf gets maximum alignment instead of char alignment (i.e. none) > because some callers use it to sort pointers, which need that on your > platform. Makes sense. Yes, exactly. >> } else { >> /* It's somewhat large, so malloc it. */ >> char *tmp = xmalloc(size); > > tmp is used instead of buf if the latter is not big enough, so it can > also contain pointers. No problem, because malloc(3) returns memory > that is properly aligned for anything already. Yes. > stable-qsort.c uses the same algorithm as compat/qsort_s.c, it just > lacks the context pointer. Shouldn't it get the same treatment? It > is used e.g. (via the macro STABLE_QSORT) in merge-ort.c to sort > pointers.. Indeed it should; I guess git_stable_qsort just isn’t used for most common commands, or at least not with structures containing pointers, as I haven’t seen it break yet. >> diff --git a/git-compat-util.h b/git-compat-util.h >> index 5fa54a7afe..28581a45c5 100644 >> --- a/git-compat-util.h >> +++ b/git-compat-util.h >> @@ -274,6 +274,17 @@ typedef unsigned long uintptr_t; >> #define _ALL_SOURCE 1 >> #endif >> >> +typedef union { >> + uintmax_t max_align_uintmax; >> + void *max_align_pointer; >> +} git_max_align; > > For your purposes just the void * member would suffice here, right? And > with the added uintmax_t this currently gets maximum alignment, suitable > for any of our objects. If we were to start using __int128 etc. then > we'd have to add that to this union as well to really get the maximum > possible alignment, though. For us, yes, uintmax_t’s alignment is never greater than void *’s. On most 32-bit non-CHERI architectures though that isn’t the case, so whilst omitting uintmax_t would be fine for qsort, it would break mem_pool_alloc’s alignment, as this gets used for both. Plus aligning to uintmax_t may still help performance for memcpy there (e.g. maybe it guarantees the memcpy implementation can use a load/store pair instruction that it wouldn’t otherwise necessarily be able to). Regarding __int128, yes, you would, though both GCC’s and FreeBSD’s stddef.h don’t admit __int128 exists. They do include long double in their max_align_t, though, which generally has the same alignment requirements as __int128 on 64-bit architectures, though I have a funny feeling it might not on some. >> + >> +typedef struct { >> + char unalign; >> + git_max_align aligned; >> +} git_max_alignment; >> +#define GIT_MAX_ALIGNMENT offsetof(git_max_alignment, aligned) > > C11 has alignas, alignof and max_align_t. We only recently started to > depend on some C99 features, so perhaps it's a bit early to use > stdalign.h in Git's code base. That's a pity, though. The > GIT_MAX_ALIGNMENT macro is sightly enough, but using a union to get > pointer alignment is a bit more cumbersome than something like > > alignas(alignof(max_align_t)) char buf[1024]; It is; max_align_t is our preferred solution and what our programming guide recommends, but not everyone is ready to jump to C11, so until then various bodges like this are needed. Jess >> + >> /* used on Mac OS X */ >> #ifdef PRECOMPOSE_UNICODE >> #include "compat/precompose_utf8.h" >> diff --git a/mem-pool.c b/mem-pool.c >> index ccdcad2e3d..748eff925a 100644 >> --- a/mem-pool.c >> +++ b/mem-pool.c >> @@ -69,9 +69,9 @@ void *mem_pool_alloc(struct mem_pool *pool, size_t len) >> struct mp_block *p = NULL; >> void *r; >> >> - /* round up to a 'uintmax_t' alignment */ >> - if (len & (sizeof(uintmax_t) - 1)) >> - len += sizeof(uintmax_t) - (len & (sizeof(uintmax_t) - 1)); >> + /* round up to a 'GIT_MAX_ALIGNMENT' alignment */ >> + if (len & (GIT_MAX_ALIGNMENT - 1)) >> + len += GIT_MAX_ALIGNMENT - (len & (GIT_MAX_ALIGNMENT - 1)); > > OK. > >> >> if (pool->mp_block && >> pool->mp_block->end - pool->mp_block->next_free >= len)