On Fri, 2023-10-06 at 01:15 +0000, Song Liu wrote:
> > On Oct 4, 2023, at 6:50 PM, Song Liu <songliubraving@xxxxxxxx> wrote:
> > > On Oct 4, 2023, at 5:11 PM, Song Liu <songliubraving@xxxxxxxx> wrote:
> 
> [...]
> 
> > > > > > > diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> > > > > > > index a8c7e1c5abfa..fd8d4b0addfc 100644
> > > > > > > --- a/kernel/bpf/hashtab.c
> > > > > > > +++ b/kernel/bpf/hashtab.c
> > > > > > > @@ -155,13 +155,15 @@ static inline int htab_lock_bucket(const struct bpf_htab *htab,
> > > > > > >  	hash = hash & min_t(u32, HASHTAB_MAP_LOCK_MASK, htab->n_buckets - 1);
> > > > > > > 
> > > > > > >  	preempt_disable();
> > > > > > > +	local_irq_save(flags);
> > > > > > >  	if (unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) {
> > > > > > >  		__this_cpu_dec(*(htab->map_locked[hash]));
> > > > > > > +		local_irq_restore(flags);
> > > > > > >  		preempt_enable();
> > > > > > >  		return -EBUSY;
> > > > > > >  	}
> > > > > > > 
> > > > > > > -	raw_spin_lock_irqsave(&b->raw_lock, flags);
> > > > > > > +	raw_spin_lock(&b->raw_lock);
> > > > 
> > > > > Song,
> > > > > 
> > > > > take a look at the s390 crash in BPF CI.
> > > > > I suspect this patch is causing it.
> > > > 
> > > > It indeed looks like it is triggered by this patch. But I haven't
> > > > figured out why it happens. v1 seems OK for the same tests.
> 
> Update on my findings today:
> 
> I tried to reproduce the issue locally with qemu on my server (x86_64).
> I got the following artifacts:
> 
> 1. bzImage and selftests from CI (need to log in to GitHub):
> https://github.com/kernel-patches/bpf/suites/16885416280/artifacts/964765766
> 
> 2. cross compiler:
> https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/13.2.0/x86_64-gcc-13.2.0-nolibc-s390-linux.tar.gz
> 
> 3. root image:
> https://libbpf-ci.s3.us-west-1.amazonaws.com/libbpf-vmtest-rootfs-2022.10.23-bullseye-s390x.tar.zst
> 
> With the bzImage compiled in CI, I can reproduce the issue with qemu.
> However, if I compile the kernel locally with the cross compiler
> (using the .config from CI, then olddefconfig), the issue cannot be
> reproduced. I have attached the two .config files here. They look
> very similar, except for the compiler version:
> 
> -CONFIG_CC_VERSION_TEXT="gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0"
> +CONFIG_CC_VERSION_TEXT="s390-linux-gcc (GCC) 13.2.0"
> 
> So far, I still think v2 is the right patch. But I really cannot
> explain the issue with the bzImage from the CI. I cannot make much
> sense of the s390 assembly code either. (TIL: gdb on my x86 server
> can disassemble s390 binaries.)
> 
> Ilya,
> 
> Could you please take a look at this?
> 
> Thanks,
> Song
> 
> PS: the root image from the CI is not easy to use. Hopefully you
> have something better than that.

Hi,

Thanks for posting the analysis and the links to the artifacts; that
saved me quite some time.

The crash is caused by a backchain issue in the trampoline code and has
nothing to do with your patch; I've posted the fix here [1].

The difference between the compilers is most likely caused by different
inlining decisions around lookup_elem_raw(): when it is inlined, the
test becomes a no-op.

Regarding GDB, Debian and Ubuntu have a gdb-multiarch package. On Fedora
one has to build it from source; the magic binutils-gdb configure flag
is --enable-targets=all.
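
For example, something along these lines should work on an x86_64 host
(the vmlinux path and the function name are only placeholders, and
lookup_elem_raw() will of course only show up if the compiler did not
inline it):

  # Disassemble one function from an s390x vmlinux with a multi-arch gdb.
  gdb-multiarch -batch \
      -ex 'file kbuild-output/vmlinux' \
      -ex 'disassemble lookup_elem_raw'

The cross toolchain linked above should also include s390-linux-objdump,
which works for this as well (s390-linux-objdump -d vmlinux).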

Regarding the development environment: in this case I've sidestepped the
need for a proper image by putting everything into the initramfs:

workdir=$(mktemp -d)
tar -C "$workdir" -xf libbpf-vmtest-rootfs-2022.10.23-bullseye-s390x.tar.zst
rsync -a selftests "$workdir"
(cd "$workdir" && find . | cpio -o --format=newc -R root:root) >initrd.cpio
qemu-system-s390x -accel kvm -smp 2 -m 4G \
    -kernel kbuild-output/arch/s390/boot/bzImage \
    -nographic -append 'nokaslr console=ttyS1 rdinit=/bin/sh' \
    -initrd initrd.cpio -s

For regular development I have a different setup, with a non-minimal
Debian install inside the guest and the testcases mounted from the host
using 9p (a rough sketch of that follows below).

Best regards,
Ilya

[1] https://lore.kernel.org/bpf/20231010203512.385819-1-iii@xxxxxxxxxxxxx/
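
A rough sketch of that 9p setup, in case it is useful (the path, mount
tag and qemu flags here are only examples, and the guest kernel needs
9P_FS and NET_9P_VIRTIO enabled):

  # On the host: export a directory to the guest via virtio-9p.
  qemu-system-s390x ... \
      -virtfs local,path="$HOME/selftests",mount_tag=selftests,security_model=none

  # Inside the guest: mount the exported directory.
  mount -t 9p -o trans=virtio,version=9p2000.L selftests /mnt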