On Wed, Nov 29, 2023 at 8:44 AM Yonghong Song <yonghong.song@xxxxxxxxx> wrote:
>
>
> On 11/29/23 2:08 AM, Jose E. Marchesi wrote:
> >> On 11/28/23 11:23 AM, Jose E. Marchesi wrote:
> >>> [During LPC 2023 we talked about improving communication between the GCC
> >>> BPF toolchain port and the kernel side.  This is the first periodical
> >>> report that we plan to publish in the GCC wiki and send to interested
> >>> parties.  Hopefully this will help.]
> >>>
> >>> GCC wiki page for the port: https://gcc.gnu.org/wiki/BPFBackEnd
> >>> IRC channel: #gccbpf at irc.oftc.net.
> >>> Help on using the port: gcc@xxxxxxxxxxx
> >>> Patches and/or development discussions: gcc-patches@xxxxxxx
> >> Thanks a lot for detailed report. Really helpful to nail down
> >> issues facing one or both compilers. See comments below for
> >> some mentioned issues.
> >>
> >>> Assembler
> >>> =========
> >> [...]
> >>
> >>> - In the Pseudo-C syntax register names are not preceded by % characters
> >>>   nor any other prefix.  A consequence of that is that in contexts like
> >>>   instruction operands, where both register names and expressions
> >>>   involving symbols are expected, there is no way to disambiguate
> >>>   between them.  GAS was allowing symbols like `w3' or `r5' in syntactic
> >>>   contexts where no registers were expected, such as in:
> >>>
> >>>     r0 = w3 ll  ; GAS interpreted w3 as symbol, clang emits error
> >>>
> >>>   The clang assembler wasn't allowing that.  During LPC we agreed that
> >>>   the simplest approach is to not allow any symbol to have the same name
> >>>   than a register, in any context.  So we changed GAS so it now doesn't
> >>>   allow to use register names as symbols in any expression, such as:
> >>>
> >>>     r0 = w3 + 1 ll  ; This now fails for both GAS and llvm.
> >>>     r0 = 1 + w3 ll  ; NOTE this does not fail with llvm, but it should.
> >> Could you provide a reproducible case above for llvm? llvm does not
> >> support syntax like 'r0 = 1 + w3 ll'. For add, it only supports
> >> 'r1 += r2' or 'r1 += 100' syntax.
> > It is a 128-bit load with an expression.  In compiler explorer, clang:
> >
> >    int
> >    foo ()
> >    {
> >      asm volatile ("r1 = 10 + w3 ll");
> >      return 0;
> >    }
> >
> > I get:
> >
> >    foo:                                    # @foo
> >            r1 = 10+w3 ll
> >            r0 = 0
> >            exit
> >
> > i.e. `10 + w3' is interpreted as an expression with two operands: the
> > literal number 10 and a symbol (not a register) `w3'.
> >
> > If the expression is `w3+10' instead, your parser recognizes the w3 as a
> > register name and errors out, as expected.
> >
> > I suppose llvm allows to hook on the expression parser to handle
> > individual operands.  That's how we handled this in GAS.
>
> Thanks for the code. I can reproduce the result with compiler explorer.
> The following is the link https://godbolt.org/z/GEGexf1Pj
> where I added -grecord-gcc-switches to dump compilation flags
> into .s file.
>
> The following is the compiler explorer compilation command line:
>    /opt/compiler-explorer/clang-trunk-20231129/bin/clang-18 -g -o /app/output.s \
>      -S --target=bpf -fcolor-diagnostics -gen-reproducer=off -O2 \
>      -g -grecord-command-line /app/example.c
>
> I then compile the above C code with
>    clang -g -S --target=bpf -fcolor-diagnostics -gen-reproducer=off -O2 -g -grecord-command-line t.c
> with identical flags.
>
> I tried locally with llvm16/17/18. They all failed compilation since
> 'r1 = 10+w3 ll' cannot be recognized by the llvm.
> We will investigate why llvm18 in compiler explorer compiles
> differently from my local build.

Is that a different issue from:
https://github.com/compiler-explorer/compiler-explorer/issues/5701 ?
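
For reference, the rule agreed at LPC (no symbol may share a name with a
register, in any context) boils down to a name check on every symbol
operand seen by the expression parser.  The following is only a rough
sketch of that check, not the actual GAS or llvm code: the function names
are hypothetical, and it assumes the register set is r0..r10 and w0..w10
with no leading zeros.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not taken from GAS or llvm): return true if NAME
   spells a BPF pseudo-C register, assumed here to be r0..r10 or
   w0..w10.  */
static bool
is_bpf_register_name (const char *name)
{
  if (name == NULL || (name[0] != 'r' && name[0] != 'w'))
    return false;

  const char *digits = name + 1;
  size_t len = strlen (digits);

  /* Register numbers have one or two digits and no leading zero.  */
  if (len == 0 || len > 2)
    return false;
  if (len == 2 && digits[0] == '0')
    return false;

  for (size_t i = 0; i < len; i++)
    if (!isdigit ((unsigned char) digits[i]))
      return false;

  return atoi (digits) <= 10;
}

/* Hypothetical hook: called for each symbol operand found while parsing
   an expression.  Returns false when the symbol must be rejected because
   it collides with a register name.  */
static bool
symbol_allowed_in_expression (const char *name)
{
  return !is_bpf_register_name (name);
}
```

With a check like this applied to each operand, both `w3 + 1' and
`1 + w3' would be rejected regardless of where the register-like name
appears in the expression.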