On Mon, 2024-01-08 at 13:21 -0800, Yonghong Song wrote:

[...]

> Thanks for the detailed analysis! Previously we intend to do the following:
>
> - When 32-bit value is passed to "r" constraint:
>   - for cpu v3/v4 a 32-bit register should be selected;
>   - for cpu v1/v2 a warning should be reported.
>
> So in the above, the desired asm code should be
>
>     ...
>     # %bb.0:
>         call bar
>         #APP
>         w0 += 1
>         #NO_APP
>         exit
>     ...
>
> for cpuv3/cpuv4. I guess some more work in llvm is needed
> to achieve that.

To make clang emit w0 the following modification is needed:

diff --git a/llvm/lib/Target/BPF/BPFISelLowering.cpp b/llvm/lib/Target/BPF/BPFISelLowering.cpp
index b20e09c7f95f..4c504d587ce6 100644
--- a/llvm/lib/Target/BPF/BPFISelLowering.cpp
+++ b/llvm/lib/Target/BPF/BPFISelLowering.cpp
@@ -265,6 +265,8 @@ BPFTargetLowering::getRegForInlineAsmConstraint(const TargetRegisterInfo *TRI,
     // GCC Constraint Letters
     switch (Constraint[0]) {
     case 'r': // GENERAL_REGS
+      if (HasAlu32 && VT == MVT::i32)
+        return std::make_pair(0U, &BPF::GPR32RegClass);
       return std::make_pair(0U, &BPF::GPRRegClass);
     case 'w':
       if (HasAlu32)

However, as Alexei notes in the sibling thread, this leads to
incompatibility with some existing inline assembly.
E.g. there are two compilation errors in selftests.
I'll write in some more detail in the sibling thread.

> On the other hand, for cpuv3/v4, for regular C code,
> I think the compiler might be already omitting the conversion and use w
> register already. So I am not sure whether the patch [6]
> is needed or not. Could you double check?

Yes, for regular C code, generated assembly uses 32-bit registers as expected:

echo $(cat <<EOF
extern unsigned long bar(unsigned);
void foo(void) {
  bar(bar(7));
}
EOF
) | clang -mcpu=v3 -O2 --target=bpf -mcpu=v3 -x c -S -o - -

...

foo:                                    # @foo
# %bb.0:
        w1 = 7
        call bar
        w1 = w0
        call bar
        exit
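
For reference, the inline assembly case discussed at the top of this
message can be reproduced with a small C file along the lines below.
This is a hypothetical sketch, not the test from the thread: the local
definition of bar(), its return value, and the host fallback branch are
assumptions added so that the file also builds and runs off-target.
When built with clang --target=bpf -mcpu=v3 -O2 -S, the asm statement
is the place where the register class chosen for "r" becomes visible.

```c
/* Hypothetical stand-in for the external bar() seen in the quoted
 * assembly; defined locally so the sketch is self-contained.
 * noinline keeps the call visible in the generated code. */
__attribute__((noinline)) unsigned bar(void)
{
	return 7;
}

unsigned foo(void)
{
	unsigned a = bar();
#if defined(__bpf__)
	/* 32-bit value bound to an "r" constraint: this is the operand
	 * whose register width (rN vs. wN) the thread is about. */
	__asm__ volatile ("%[a] += 1" : [a] "+r"(a));
#else
	/* Assumption: plain C fallback so the file runs on a host. */
	a += 1;
#endif
	return a;
}
```

With the BPFISelLowering.cpp change above applied, one would expect
the operand to be printed as a w register for cpu v3/v4; I have not
listed the output here since it depends on the compiler build.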