On Sat, Jul 27, 2019 at 7:11 PM Yonghong Song <yhs@xxxxxx> wrote:
>
>
>
> On 7/27/19 1:16 AM, Sedat Dilek wrote:
> > On Sat, Jul 27, 2019 at 9:36 AM Sedat Dilek <sedat.dilek@xxxxxxxxx> wrote:
> >>
> >> On Sat, Jul 27, 2019 at 4:24 AM Alexei Starovoitov
> >> <alexei.starovoitov@xxxxxxxxx> wrote:
> >>>
> >>> On Fri, Jul 26, 2019 at 2:19 PM Sedat Dilek <sedat.dilek@xxxxxxxxx> wrote:
> >>>>
> >>>> On Fri, Jul 26, 2019 at 11:10 PM Yonghong Song <yhs@xxxxxx> wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>> On 7/26/19 2:02 PM, Sedat Dilek wrote:
> >>>>>> On Fri, Jul 26, 2019 at 10:38 PM Sedat Dilek <sedat.dilek@xxxxxxxxx> wrote:
> >>>>>>>
> >>>>>>> Hi Yonghong Song,
> >>>>>>>
> >>>>>>> On Fri, Jul 26, 2019 at 5:45 PM Yonghong Song <yhs@xxxxxx> wrote:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 7/26/19 1:26 AM, Sedat Dilek wrote:
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>> I have opened a new issue in the ClangBuiltLinux issue tracker.
> >>>>>>>>
> >>>>>>>> Glad to know clang 9 has asm goto support and now It can compile
> >>>>>>>> kernel again.
> >>>>>>>>
> >>>>>>>
> >>>>>>> Yupp.
> >>>>>>>
> >>>>>>>>>
> >>>>>>>>> I am seeing a problem in the area bpf/seccomp causing
> >>>>>>>>> systemd/journald/udevd services to fail.
> >>>>>>>>>
> >>>>>>>>> [Fri Jul 26 08:08:43 2019] systemd[453]: systemd-udevd.service: Failed
> >>>>>>>>> to connect stdout to the journal socket, ignoring: Connection refused
> >>>>>>>>>
> >>>>>>>>> This happens when I use the (LLVM) LLD ld.lld-9 linker but not with
> >>>>>>>>> BFD linker ld.bfd on Debian/buster AMD64.
> >>>>>>>>> In both cases I use clang-9 (prerelease).
> >>>>>>>>
> >>>>>>>> Looks like it is a lld bug.
> >>>>>>>>
> >>>>>>>> I see the stack trace has __bpf_prog_run32() which is used by
> >>>>>>>> kernel bpf interpreter. Could you try to enable bpf jit
> >>>>>>>> sysctl net.core.bpf_jit_enable = 1
> >>>>>>>> If this passed, it will prove it is interpreter related.
> >>>>>>>>
> >>>>>>>
> >>>>>>> After...
> >>>>>>>
> >>>>>>> sysctl -w net.core.bpf_jit_enable=1
> >>>>>>>
> >>>>>>> I can start all failed systemd services.
> >>>>>>>
> >>>>>>> systemd-journald.service
> >>>>>>> systemd-udevd.service
> >>>>>>> haveged.service
> >>>>>>>
> >>>>>>> This is in maintenance mode.
> >>>>>>>
> >>>>>>> What is next: Do set a permanent sysctl setting for net.core.bpf_jit_enable?
> >>>>>>>
> >>>>>>
> >>>>>> This is what I did:
> >>>>>
> >>>>> I probably won't have cycles to debug this potential lld issue.
> >>>>> Maybe you already did, I suggest you put enough reproducible
> >>>>> details in the bug you filed against lld so they can take a look.
> >>>>>
> >>>>
> >>>> I understand and will put the journalctl-log into the CBL issue
> >>>> tracker and update informations.
> >>>>
> >>>> Thanks for your help understanding the BPF correlations.
> >>>>
> >>>> Is setting 'net.core.bpf_jit_enable = 2' helpful here?
> >>>
> >>> jit_enable=1 is enough.
> >>> Or use CONFIG_BPF_JIT_ALWAYS_ON to workaround.
> >>>
> >>> It sounds like clang miscompiles interpreter.
> >
> > Just to clarify:
> > This does not happen with clang-9 + ld.bfd (GNU/ld linker).
> >
> >>> modprobe test_bpf
> >>> should be able to point out which part of interpreter is broken.
> >>
> >> Maybe we need something like...
> >>
> >> "bpf: Disable GCC -fgcse optimization for ___bpf_prog_run()"
> >>
> >> ...for clang?
> >>
> >
> > Not sure if something like GCC's...
> >
> > -fgcse
> >
> > Perform a global common subexpression elimination pass. This pass also
> > performs global constant and copy propagation.
> >
> > Note: When compiling a program using computed gotos, a GCC extension,
> > you may get better run-time performance if you disable the global
> > common subexpression elimination pass by adding -fno-gcse to the
> > command line.
> >
> > Enabled at levels -O2, -O3, -Os.
> >
> > ...is available for clang.
> >
> > I tried with hopping to turn off "global common subexpression elimination":
> >
> > diff --git a/arch/x86/net/Makefile b/arch/x86/net/Makefile
> > index 383c87300b0d..92f934a1e9ff 100644
> > --- a/arch/x86/net/Makefile
> > +++ b/arch/x86/net/Makefile
> > @@ -3,6 +3,8 @@
> > # Arch-specific network modules
> > #
> >
> > +KBUILD_CFLAGS += -O0
>
> This won't work. First, you added to the wrong file. The interpreter
> is at kernel/bpf/core.c.
>

Thanks for the clarification.

I mixed up the x86 BPF JIT compiler with the BPF interpreter.

I see no diff in the disassembled kernel/bpf/core.o in my clang9-bfd
and clang9-lld build-dirs.

$ objdump -M intel -d linux.clang9-bfd/kernel/bpf/core.o > bpf_core_o_clang9-bfd.txt
$ objdump -M intel -d linux.clang9-lld/kernel/bpf/core.o > bpf_core_o_clang9-lld.txt

--- bpf_core_o_clang9-bfd.txt 2019-07-28 13:11:59.363552042 +0200
+++ bpf_core_o_clang9-lld.txt 2019-07-28 13:12:09.975535278 +0200
@@ -1,5 +1,5 @@

-linux.clang9-bfd/kernel/bpf/core.o: file format elf64-x86-64
+linux.clang9-lld/kernel/bpf/core.o: file format elf64-x86-64


 Disassembly of section .text:

> Second, kernel may have compilation issues with -O0.
>

Confirmed.

- Sedat -

> > +
> > ifeq ($(CONFIG_X86_32),y)
> > obj-$(CONFIG_BPF_JIT) += bpf_jit_comp32.o
> > else
> >
> > Still see...
> > BROKEN: test_bpf: #294 BPF_MAXINSNS: Jump, gap, jump, ... jited:0
> >
> > - Sedat -
> >
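
A note on the core.o comparison: matching listings are generally what one would expect at this stage. The object files are produced by clang before the linker runs, so a difference between ld.bfd and ld.lld should only become visible in the linked image. Below is a minimal sketch for comparing two objdump listings while skipping the "file format" header, which embeds the build directory and therefore always differs. The helper name diff_disasm and the vmlinux paths in the usage comment are assumptions for illustration (they follow the build-dir names used in this mail), not something taken from the thread.

```shell
# Hypothetical helper: diff two objdump listings, ignoring the
# "file format" header line that carries the build-dir path.
diff_disasm() {
    dd_a=$(mktemp) || return 2
    dd_b=$(mktemp) || { rm -f "$dd_a"; return 2; }
    # Drop the header line from both listings before comparing.
    grep -v 'file format' "$1" > "$dd_a"
    grep -v 'file format' "$2" > "$dd_b"
    diff -u "$dd_a" "$dd_b"
    dd_status=$?
    rm -f "$dd_a" "$dd_b"
    # 0 = listings match, 1 = they differ (same convention as diff).
    return "$dd_status"
}

# Assumed usage on the linked images rather than on core.o:
#   objdump -M intel -d linux.clang9-bfd/vmlinux > vmlinux_clang9-bfd.txt
#   objdump -M intel -d linux.clang9-lld/vmlinux > vmlinux_clang9-lld.txt
#   diff_disasm vmlinux_clang9-bfd.txt vmlinux_clang9-lld.txt
```

A whole-image diff will likely be noisy, because the two linkers may place sections at different addresses, so narrowing the listings to the interpreter functions (for example ___bpf_prog_run) before comparing is probably more useful.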