Re: asm register constraint. Was: [PATCH v2 bpf-next 2/5] bpf: Introduce "volatile compare" macro

"Jose E. Marchesi" <jose.marchesi@xxxxxxxxxx> · Thu, 11 Jan 2024 11:34:28 +0100

> On Tue, Jan 9, 2024 at 3:00 AM Jose E. Marchesi
> <jose.marchesi@xxxxxxxxxx> wrote:
>>
>> >
>> > Also need to align with GCC. (Jose cc-ed)
>>
>> GCC doesn't have an integrated assembler, so using -masm=pseudoc it just
>> compiles the program above to:
>>
>>   foo:
>>         call bar
>>         r0 += 1
>>         exit
>>
>> Also, at the moment we don't support a "w" constraint, because the
>> assembly-like assembly syntax we started with implies different
>> instructions that interpret the values stored in the BPF 64-bit
>> registers as 32-bit or 64-bit values, i.e.
>>
>>   mov %r1, 1
>>   mov32 %r1, 1
>
> Heh. gcc tried to invent a traditional looking asm for bpf and instead
> invented the above :)

Very funny, but we didn't invent it.  We took it from ubpf.

> x86 and arm64 use single 'mov' and encode sub-registers as rax/eax or
> x0/w0.

Yes both targets support specifying portions of the 64-bit registers
using pseudo-register names, which is a better approch vs. using
explicit mnemonics for the 32-bit operations (mov32, add32, etc) because
it makes it possible to specify which instruction to use in a
per-operand basis, like making the mode of actually passed arguments in
inline assembly to influence the operation to be performed.

It is nice to have it also in BPF.

> imo support of gcc-only asm style is an obstacle in gcc-bpf adoption.
> It's not too far to reconsider supporting this. You can easily
> remove the support and it will reduce your maintenance/support work.
> It's a bit of a distraction in this thread too.
>
>> But then the pseudo-c assembly syntax (that we also support) translates
>> some of the semantics of the instructions to the register names,
>> creating the notion that BPF actually has both 32-bit registers and
>> 64-bit registers, i.e.
>>
>>   r1 += 1
>>   w1 += 1
>>
>> In GCC we support both assembly syntaxes and currently we lack the
>> ability to emit 32-bit variants in templates like "%[reg] += 1", so I
>> suppose we can introduce a "w" constraint to:
>>
>> 2. When pseudo-c assembly syntax is used, expect a 32-bit mode to match
>>    the operand and warn about operand size overflow whenever necessary,
>>    and then emit "w" instead of "r" as the register name.
>
> clang supports "w" constraint with -mcpu=v3,v4 and emits 'w'
> as register name.
>
>> > And, the most importantly, we need a way to go back to old behavior,
>> > since u32 var; asm("...":: "r"(var)); will now
>> > allocate "w" register or warn.
>>
>> Is it really necessary to change the meaning of "r"?  You can write
>> templates like the one triggering this problem like:
>>
>>   asm volatile ("%[reg] += 1"::[reg]"w"((unsigned)bar()));
>>
>> Then the checks above will be performed, driven by the particular
>> constraint explicitly specified by the user, not driven by the type of
>> the value passed as the operand.
>
> That's a good question.
> For x86 "r" constraint means 8, 16, 32, or 64 bit integer.
> For arm64 "r" constraint means 32 or 64 bit integer.
>
> and this is traditional behavior of "r" in other asms too:
> AMDGPU - 32 or 64
> Hexagon - 32 or 64
> powerpc - 32 or 64
> risc-v - 32 or 64
> imo it makes sense for bpf asm to align with the rest so that:

Yes you are right and I agree.  It makes sense to follow the established
practice where "r" can lead to any pseudo-register name depending on the
mode of the operand, like in x86_64:

   char     -> %al
   short    -> %ax
   int      -> %eax
   long int -> %rax

And then add diagnostics conditioned on the availability of 32-bit
instructions (alu32).

>
> asm volatile ("%[reg] += 1"::[reg]"r"((unsigned)bar())); would generate
> w0 += 1, NO warn (with -mcpu=v3,v4; and a warn with -mcpu=v1,v2)
>
> asm volatile ("%[reg] += 1"::[reg]"r"((unsigned long)bar()));
> r0 += 1, NO warn
>
> asm volatile ("%[reg] += 1"::[reg]"w"((unsigned)bar()));
> w0 += 1, NO warn
>
> asm volatile ("%[reg] += 1"::[reg]"w"((unsigned long)bar()));
> w0 += 1 and a warn (currently there is none in clang)

Makes sense to me.

> I think we can add "R" constraint to mean 64-bit register only:
>
> asm volatile ("%[reg] += 1"::[reg]"R"((unsigned)bar()));
> r0 += 1 and a warn
>
> asm volatile ("%[reg] += 1"::[reg]"R"((unsigned long)bar()));
> r0 += 1, NO warn

The x86 target has similar constraints "q" (for %Rl registers) and "Q"
(for %Rh registers) but not for 32 and 64 pseudo-registers that I can
see.