> On Tue, Jan 9, 2024 at 3:00 AM Jose E. Marchesi
> <jose.marchesi@xxxxxxxxxx> wrote:
>>
>> >
>> > Also need to align with GCC. (Jose cc-ed)
>>
>> GCC doesn't have an integrated assembler, so using -masm=pseudoc it just
>> compiles the program above to:
>>
>>   foo:
>>   call bar
>>   r0 += 1
>>   exit
>>
>> Also, at the moment we don't support a "w" constraint, because the
>> assembly-like assembly syntax we started with implies different
>> instructions that interpret the values stored in the BPF 64-bit
>> registers as 32-bit or 64-bit values, i.e.
>>
>>   mov %r1, 1
>>   mov32 %r1, 1
>
> Heh. gcc tried to invent a traditional looking asm for bpf and instead
> invented the above :)

Very funny, but we didn't invent it.  We took it from ubpf.

> x86 and arm64 use single 'mov' and encode sub-registers as rax/eax or
> x0/w0.

Yes, both targets support specifying portions of the 64-bit registers
using pseudo-register names. That is a better approach than using
explicit mnemonics for the 32-bit operations (mov32, add32, etc.)
because it makes it possible to select the instruction on a per-operand
basis, for example by letting the mode of the arguments actually passed
in inline assembly determine the operation to be performed.

It is nice to have it also in BPF.

> imo support of gcc-only asm style is an obstacle in gcc-bpf adoption.
> It's not too far to reconsider supporting this. You can easily
> remove the support and it will reduce your maintenance/support work.
> It's a bit of a distraction in this thread too.
>
>> But then the pseudo-c assembly syntax (that we also support) translates
>> some of the semantics of the instructions to the register names,
>> creating the notion that BPF actually has both 32-bit registers and
>> 64-bit registers, i.e.
>>
>>   r1 += 1
>>   w1 += 1
>>
>> In GCC we support both assembly syntaxes and currently we lack the
>> ability to emit 32-bit variants in templates like "%[reg] += 1", so I
>> suppose we can introduce a "w" constraint to:
>>
>> 2. When pseudo-c assembly syntax is used, expect a 32-bit mode to match
>>    the operand and warn about operand size overflow whenever necessary,
>>    and then emit "w" instead of "r" as the register name.
>
> clang supports "w" constraint with -mcpu=v3,v4 and emits 'w'
> as register name.
>
>> > And, the most importantly, we need a way to go back to old behavior,
>> > since u32 var; asm("...":: "r"(var)); will now
>> > allocate "w" register or warn.
>>
>> Is it really necessary to change the meaning of "r"?  You can write
>> templates like the one triggering this problem like:
>>
>>   asm volatile ("%[reg] += 1"::[reg]"w"((unsigned)bar()));
>>
>> Then the checks above will be performed, driven by the particular
>> constraint explicitly specified by the user, not driven by the type of
>> the value passed as the operand.
>
> That's a good question.
> For x86 "r" constraint means 8, 16, 32, or 64 bit integer.
> For arm64 "r" constraint means 32 or 64 bit integer.
>
> and this is traditional behavior of "r" in other asms too:
> AMDGPU - 32 or 64
> Hexagon - 32 or 64
> powerpc - 32 or 64
> risc-v - 32 or 64
> imo it makes sense for bpf asm to align with the rest so that:

Yes, you are right and I agree.  It makes sense to follow the
established practice where "r" can lead to any pseudo-register name
depending on the mode of the operand, like in x86_64:

   char     -> %al
   short    -> %ax
   int      -> %eax
   long int -> %rax

And then add diagnostics conditioned on the availability of 32-bit
instructions (alu32).

>
>    asm volatile ("%[reg] += 1"::[reg]"r"((unsigned)bar())); would generate
>    w0 += 1, NO warn (with -mcpu=v3,v4; and a warn with -mcpu=v1,v2)
>
>    asm volatile ("%[reg] += 1"::[reg]"r"((unsigned long)bar()));
>    r0 += 1, NO warn
>
>    asm volatile ("%[reg] += 1"::[reg]"w"((unsigned)bar()));
>    w0 += 1, NO warn
>
>    asm volatile ("%[reg] += 1"::[reg]"w"((unsigned long)bar()));
>    w0 += 1 and a warn (currently there is none in clang)

Makes sense to me.

> I think we can add "R" constraint to mean 64-bit register only:
>
>    asm volatile ("%[reg] += 1"::[reg]"R"((unsigned)bar()));
>    r0 += 1 and a warn
>
>    asm volatile ("%[reg] += 1"::[reg]"R"((unsigned long)bar()));
>    r0 += 1, NO warn

The x86 target has similar constraints "q" (for %Rl registers) and "Q"
(for %Rh registers) but not for 32 and 64 pseudo-registers that I can
see.
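
To make sure we read the proposal the same way, here is a rough sketch
in plain C of the constraint/mode table quoted in this message.  It is
only a model for discussion, not code from GCC or clang: the names
reg_choice, choose_reg, expand_template and check_thread_examples are
made up for illustration, only 32-bit and 64-bit operands are modelled,
and has_alu32 stands for -mcpu=v3,v4.  The thread does not say what "w"
or "R" should do when alu32 is not available, so the sketch ignores
has_alu32 for those two constraints.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model of the rules quoted above; not GCC or clang code. */
struct reg_choice {
    char prefix;  /* 'w' = 32-bit register name, 'r' = 64-bit register name */
    bool warn;    /* whether a diagnostic would be issued */
};

/* constraint: 'r', 'w' or 'R'; operand_bits: 32 or 64;
   has_alu32: true for -mcpu=v3,v4, false for -mcpu=v1,v2. */
static struct reg_choice
choose_reg(char constraint, unsigned operand_bits, bool has_alu32)
{
    struct reg_choice c = { 'r', false };

    switch (constraint) {
    case 'r':
        /* "r" follows the mode of the operand. */
        if (operand_bits == 32) {
            c.prefix = 'w';
            c.warn = !has_alu32;  /* warn with -mcpu=v1,v2 */
        }
        break;
    case 'w':
        /* "w" always names the 32-bit register; warn on a 64-bit operand. */
        c.prefix = 'w';
        c.warn = operand_bits != 32;
        break;
    case 'R':
        /* Proposed "R": 64-bit register only; warn on a 32-bit operand. */
        c.prefix = 'r';
        c.warn = operand_bits != 64;
        break;
    default:
        /* Assumption: anything else is flagged; the thread does not cover it. */
        c.warn = true;
        break;
    }
    return c;
}

/* Expand the template "%[reg] += 1" for register number regno. */
static void
expand_template(char *buf, size_t len, struct reg_choice c, unsigned regno)
{
    snprintf(buf, len, "%c%u += 1", c.prefix, regno);
}

/* The examples from the thread, with -mcpu=v3,v4 assumed. */
static void
check_thread_examples(void)
{
    char buf[16];
    struct reg_choice c;

    c = choose_reg('r', 32, true);      /* "r"((unsigned)bar()) */
    expand_template(buf, sizeof buf, c, 0);
    assert(strcmp(buf, "w0 += 1") == 0 && !c.warn);

    c = choose_reg('r', 64, true);      /* "r"((unsigned long)bar()) */
    expand_template(buf, sizeof buf, c, 0);
    assert(strcmp(buf, "r0 += 1") == 0 && !c.warn);

    c = choose_reg('w', 64, true);      /* "w"((unsigned long)bar()) */
    expand_template(buf, sizeof buf, c, 0);
    assert(strcmp(buf, "w0 += 1") == 0 && c.warn);

    c = choose_reg('R', 32, true);      /* "R"((unsigned)bar()) */
    expand_template(buf, sizeof buf, c, 0);
    assert(strcmp(buf, "r0 += 1") == 0 && c.warn);
}
```

Calling check_thread_examples() from a small main is enough to exercise
the table.  If I got any of the cases wrong, the corresponding branch
of choose_reg is the place to correct it.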