On 11/29/23 12:01 PM, Alexei Starovoitov wrote:
On Wed, Nov 29, 2023 at 8:44 AM Yonghong Song <yonghong.song@xxxxxxxxx> wrote:
On 11/29/23 2:08 AM, Jose E. Marchesi wrote:
On 11/28/23 11:23 AM, Jose E. Marchesi wrote:
[During LPC 2023 we talked about improving communication between the GCC
BPF toolchain port and the kernel side. This is the first periodic
report that we plan to publish in the GCC wiki and send to interested
parties. Hopefully this will help.]
GCC wiki page for the port: https://gcc.gnu.org/wiki/BPFBackEnd
IRC channel: #gccbpf at irc.oftc.net.
Help on using the port: gcc@xxxxxxxxxxx
Patches and/or development discussions: gcc-patches@xxxxxxx
Thanks a lot for the detailed report. Really helpful to nail down
issues facing one or both compilers. See comments below for
some mentioned issues.
Assembler
=========
[...]
- In the Pseudo-C syntax register names are not preceded by % characters
nor any other prefix. A consequence of that is that in contexts like
instruction operands, where both register names and expressions
involving symbols are expected, there is no way to disambiguate
between them. GAS was allowing symbols like `w3' or `r5' in syntactic
contexts where no registers were expected, such as in:
r0 = w3 ll ; GAS interpreted w3 as symbol, clang emits error
The clang assembler wasn't allowing that. During LPC we agreed that
the simplest approach is to not allow any symbol to have the same name
as a register, in any context. So we changed GAS so that it no longer
allows register names to be used as symbols in any expression, such as:
r0 = w3 + 1 ll ; This now fails for both GAS and llvm.
r0 = 1 + w3 ll ; NOTE this does not fail with llvm, but it should.
Could you provide a reproducible case above for llvm? llvm does not
support syntax like 'r0 = 1 + w3 ll'. For add, it only supports
'r1 += r2' or 'r1 += 100' syntax.
It is a 128-bit load instruction (a 64-bit immediate load) with an
expression. In Compiler Explorer with clang, for:
int
foo ()
{
  asm volatile ("r1 = 10 + w3 ll");
  return 0;
}
I get:
foo:                            # @foo
        r1 = 10+w3 ll
        r0 = 0
        exit
i.e. `10 + w3' is interpreted as an expression with two operands: the
literal number 10 and a symbol (not a register) `w3'.
If the expression is `w3+10' instead, your parser recognizes the w3 as a
register name and errors out, as expected.
I suppose llvm allows hooking into the expression parser to handle
individual operands. That's how we handled this in GAS.
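To make the rule concrete, here is a minimal sketch in C of a
per-operand check that rejects an expression when any identifier in it
is spelled like a register, regardless of its position. This is not the
GAS or llvm code: the function names are hypothetical, the register set
is assumed to be r0..r10 plus w0..w10, and the tokenizer is deliberately
simplistic. The `ll' suffix is assumed to have been stripped already.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if the N-byte token TOK is spelled like a
   BPF register (assumed set: r0..r10 and w0..w10).  */
bool
is_bpf_register_name (const char *tok, size_t n)
{
  if (n < 2 || n > 3 || (tok[0] != 'r' && tok[0] != 'w'))
    return false;
  if (!isdigit ((unsigned char) tok[1]))
    return false;
  if (n == 2)
    return true;                           /* r0..r9, w0..w9 */
  return tok[1] == '1' && tok[2] == '0';   /* r10, w10 */
}

/* Hypothetical check: true if any identifier operand in EXPR is a
   register name.  Every operand is inspected, not only the first.  */
bool
expr_uses_register_name (const char *expr)
{
  const char *p = expr;

  while (*p != '\0')
    {
      if (isalpha ((unsigned char) *p) || *p == '_' || *p == '.')
        {
          /* Collect one identifier and test it.  */
          const char *start = p;
          while (isalnum ((unsigned char) *p) || *p == '_'
                 || *p == '.' || *p == '$')
            p++;
          if (is_bpf_register_name (start, (size_t) (p - start)))
            return true;
        }
      else if (isdigit ((unsigned char) *p))
        {
          /* Skip a numeric literal such as 10 or 0x1f as a whole, so
             its letters are not mistaken for an identifier.  */
          while (isalnum ((unsigned char) *p))
            p++;
        }
      else
        p++;   /* Operators, parentheses, whitespace.  */
    }
  return false;
}
```

With a check of this shape both `10 + w3' and `w3 + 10' are rejected,
while an expression such as `foo + 1' is still accepted.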
Thanks for the code. I can reproduce the result with compiler explorer.
Here is the link: https://godbolt.org/z/GEGexf1Pj
There I added -grecord-gcc-switches to dump the compilation flags
into the .s file.
The following is the compiler explorer compilation command line:
/opt/compiler-explorer/clang-trunk-20231129/bin/clang-18 -g -o /app/output.s \
-S --target=bpf -fcolor-diagnostics -gen-reproducer=off -O2 \
-g -grecord-command-line /app/example.c
I then compiled the above C code locally with identical flags:
clang -g -S --target=bpf -fcolor-diagnostics -gen-reproducer=off -O2 -g -grecord-command-line t.c
I tried locally with llvm16/17/18. All of them fail to compile it, since
'r1 = 10+w3 ll' is not recognized by llvm.
We will investigate why llvm18 in compiler explorer compiles
differently from my local build.
Is that a different issue from:
https://github.com/compiler-explorer/compiler-explorer/issues/5701
?
Yes, it is a different one. I verified that issue #5701 has been fixed.