Re: [PATCH] Properly align memory allocations and temporary buffers

Jessica Clarke <jrtc27@xxxxxxxxxx> · Fri, 7 Jan 2022 00:22:35 +0000

On 7 Jan 2022, at 00:10, Junio C Hamano <gitster@xxxxxxxxx> wrote:
> 
> Jessica Clarke <jrtc27@xxxxxxxxxx> writes:
> 
>> This is also true of uint128_t; it doesn’t fit in a uintmax_t either.
> 
> uintmax_t is supposed to be an unsigned integer type capable of
> representing any value of any unsigned integer type, so if you have a
> 128-bit unsigned integer, your uintmax_t should be at least that
> wide, or your uintmax_t is not uintmax_t as far as the C standard is
> concerned, no?

Yes. Every 64-bit architecture implemented by GCC and Clang violates
this. This is why uintmax_t is a terrible idea: it gets baked into your
ABI, and then you can’t add new, wider integer types. For 128-bit
integers, people decided it was better to add them than to let
uintmax_t constrain them.
We take the same approach for CHERI of not caring about uintmax_t. If
you want to hold this against CHERI, go file bugs against GCC and Clang
for violating the standard on x86_64, aarch64, mips64, powerpc64,
s390x, sparc64, and so on.
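
To make the mismatch concrete, here is a minimal C sketch. The function
names are mine and purely illustrative. It assumes a GCC- or Clang-style
compiler that defines __SIZEOF_INT128__ when unsigned __int128 is
available; on anything else it just reports that no wider type exists.

```c
#include <stddef.h>
#include <stdint.h>

/* Width in bytes of uintmax_t on this implementation. */
static size_t uintmax_width(void)
{
    return sizeof(uintmax_t);
}

/*
 * Width in bytes of unsigned __int128, or 0 if the compiler does not
 * provide it (assumption: availability is signalled by the
 * __SIZEOF_INT128__ macro, as GCC and Clang do).
 */
static size_t int128_width(void)
{
#ifdef __SIZEOF_INT128__
    return sizeof(unsigned __int128);
#else
    return 0;
#endif
}

/*
 * Returns 1 if a 128-bit value with a bit set above bit 63 survives a
 * round trip through uintmax_t, 0 if the high bits are lost.
 * Returns 1 trivially when there is no 128-bit type to test with.
 */
static int high_bits_survive_uintmax(void)
{
#ifdef __SIZEOF_INT128__
    unsigned __int128 wide = ((unsigned __int128)1 << 100) | 1u;
    uintmax_t narrowed = (uintmax_t)wide;   /* truncates if narrower */
    return (unsigned __int128)narrowed == wide;
#else
    return 1;
#endif
}
```

On a typical x86_64 or aarch64 Linux build I would expect
uintmax_width() to be 8 and int128_width() to be 16, so
high_bits_survive_uintmax() returns 0.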

> uintptr_t is an unsigned integer type such that any valid pointer to
> void can be converted to it, then converted back to pointer to
> void, and the result will compare equal to the original pointer.  So
> if the value of uintptr_t cannot be represented by uintmax_t, there
> is something wrong.
> 
>> uintmax_t was a mistake as it becomes part of the ABI and can never be
>> revised even when new integer types come along. uintmax_t can hold any
>> valid address, but will strip the metadata.
> 
> It is a flaw in the implementation of uintmax_t on the architecture
> that needs "the metadata", no?  If the implementation supports a
> notion of uintptr_t (i.e. there exists an unsigned integer type that
> can safely go back and forth from pointer to void), an unsigned
> integer type that is at least as wide as any unsigned integer type
> should certainly be able to hold what would fit in uintptr_t, no?

If you want to get really language-lawyer-y about it, you can actually
argue that this is a compliant implementation of the C standard.
Integer types are permitted to have padding bits, and some combinations
of padding bits are allowed to be trap representations. Technically, in
our representation, the metadata bits are padding bits, because they do
not contribute to the precision like value bits. It is therefore the
case that the *value* of a uintptr_t still fits into a uintmax_t, but
the latter has no padding bits, and casting the latter to the former
yields a trap representation when further cast back to a pointer. This
may not be the intent of the spec, and not how anyone thinks of it because
CHERI is the first implementation that pushes the boundary here, but
it’s technically legal under that interpretation. You may disagree with
the interpretation, and I don’t like to use it most of the time because
it’s complicated and involves yet more ill-defined parts of the spec.
For example, the spec says arithmetic operations on valid values never
generate a trap representation other than as part of an exceptional
condition such as an overflow, but it nowhere defines what counts as an
arithmetic operation. (I assume "values" there means objects, as the
value only includes value bits, whereas the input could be a trap
representation on some implementations.)
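
As a rough illustration of the distinction between the value and the
full representation, here is a small C sketch. The names are
hypothetical, and I have only described what it does on a conventional
flat-address target; the CHERI behaviour noted in the comments is the
interpretation argued above, not something this code can demonstrate
on other hardware.

```c
#include <stdint.h>

/* Some object to take the address of. */
static int sample_object;

/*
 * The round trip the standard guarantees: void * -> uintptr_t -> void *
 * must compare equal to the original pointer.
 */
static int roundtrip_via_uintptr(void *p)
{
    uintptr_t u = (uintptr_t)p;
    void *q = (void *)u;
    return q == p;
}

/*
 * Checks only that the integer *value* of the uintptr_t survives a trip
 * through uintmax_t. The result is deliberately never converted back to
 * a pointer or dereferenced: under the interpretation above, on CHERI
 * the address would be preserved here but the metadata would not, so
 * such a pointer would not be usable.
 */
static int address_fits_uintmax(void *p)
{
    uintptr_t u = (uintptr_t)p;
    uintmax_t m = (uintmax_t)u;
    return (uintptr_t)m == u;
}
```

Calling both with &sample_object should return 1 on a conventional
64-bit target, where uintptr_t and uintmax_t are both plain 64-bit
integers.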

Jess

> Puzzled.
>