Re: [PATCH v2 bpf-next] bpf: Avoid unnecessary -EBUSY from htab_lock_bucket

> On Oct 4, 2023, at 6:50 PM, Song Liu <songliubraving@xxxxxxxx> wrote:
>> On Oct 4, 2023, at 5:11 PM, Song Liu <songliubraving@xxxxxxxx> wrote:

[...]

>>>>>
>>>>>> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
>>>>>> index a8c7e1c5abfa..fd8d4b0addfc 100644
>>>>>> --- a/kernel/bpf/hashtab.c
>>>>>> +++ b/kernel/bpf/hashtab.c
>>>>>> @@ -155,13 +155,15 @@ static inline int htab_lock_bucket(const struct bpf_htab *htab,
>>>>>>     hash = hash & min_t(u32, HASHTAB_MAP_LOCK_MASK, htab->n_buckets - 1);
>>>>>>
>>>>>>     preempt_disable();
>>>>>> +       local_irq_save(flags);
>>>>>>     if (unlikely(__this_cpu_inc_return(*(htab->map_locked[hash])) != 1)) {
>>>>>>             __this_cpu_dec(*(htab->map_locked[hash]));
>>>>>> +               local_irq_restore(flags);
>>>>>>             preempt_enable();
>>>>>>             return -EBUSY;
>>>>>>     }
>>>>>>
>>>>>> -       raw_spin_lock_irqsave(&b->raw_lock, flags);
>>>>>> +       raw_spin_lock(&b->raw_lock);
>>>>
>>>> Song,
>>>>
>>>> take a look at s390 crash in BPF CI.
>>>> I suspect this patch is causing it.
>>>
>>> It indeed looks like triggered by this patch. But I haven't figured
>>> out why it happens. v1 seems ok for the same tests.

An update on my findings today:

I tried to reproduce the issue locally with qemu on my server (x86_64).
I got the following artifacts:

1. bzImage and selftests from CI: (requires logging in to GitHub)
https://github.com/kernel-patches/bpf/suites/16885416280/artifacts/964765766

2. cross compiler:
https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/13.2.0/x86_64-gcc-13.2.0-nolibc-s390-linux.tar.gz

3. root image:
https://libbpf-ci.s3.us-west-1.amazonaws.com/libbpf-vmtest-rootfs-2022.10.23-bullseye-s390x.tar.zst

With the bzImage built in CI, I can reproduce the issue with qemu.
However, if I build the kernel locally with the cross compiler
(using the .config from CI, followed by olddefconfig), the issue
cannot be reproduced. I have attached the two .config files here.
They are nearly identical, except for the compiler version:

-CONFIG_CC_VERSION_TEXT="gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0"
+CONFIG_CC_VERSION_TEXT="s390-linux-gcc (GCC) 13.2.0"

So far, I still think v2 is the right patch. But I really cannot
explain the issue with the bzImage from the CI. I cannot make much
sense of the s390 assembly code either. (TIL: gdb on my x86
server can disassemble s390 binaries.)
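
For anyone following the thread, the ordering change in v2 can be
illustrated with a small userspace model. This is only a sketch under
stated assumptions, not kernel code: struct sim_cpu and every sim_*
helper are hypothetical names invented for illustration, one plain int
stands in for the per-CPU htab->map_locked[hash] counter, and an
"interrupt" is just a function call made at the point where a real IRQ
could arrive. It compares the old ordering (recursion check first, IRQs
disabled afterwards by raw_spin_lock_irqsave()) with the v2 ordering
(local_irq_save() before the check).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a single CPU; none of these names are kernel APIs. */
struct sim_cpu {
	int  map_locked;   /* stands in for the per-CPU map_locked[hash] counter */
	bool irqs_enabled; /* stands in for the CPU's interrupt-enable state */
	bool irq_pending;  /* an interrupt is waiting to be delivered */
	int  irq_result;   /* result of the nested access made from the IRQ */
};

enum lock_order { IRQ_OFF_AFTER_CHECK, IRQ_OFF_BEFORE_CHECK };

/* Recursion check modeled on the inc/dec pair in the quoted diff. */
static int sim_try_mark(struct sim_cpu *cpu)
{
	if (++cpu->map_locked != 1) {
		cpu->map_locked--;
		return -EBUSY;
	}
	return 0;
}

static void sim_unmark(struct sim_cpu *cpu)
{
	assert(cpu->map_locked > 0);
	cpu->map_locked--;
}

/* Deliver the pending interrupt, but only if interrupts are enabled.
 * The handler models a BPF program that touches the same bucket. */
static void sim_deliver_irq(struct sim_cpu *cpu)
{
	if (!cpu->irqs_enabled || !cpu->irq_pending)
		return;
	cpu->irq_pending = false;
	cpu->irqs_enabled = false;	/* the handler runs with IRQs off */
	cpu->irq_result = sim_try_mark(cpu);
	if (cpu->irq_result == 0)
		sim_unmark(cpu);
	cpu->irqs_enabled = true;
}

/* Returns what the nested access from the interrupt observed. */
int sim_nested_result(enum lock_order order)
{
	struct sim_cpu cpu = {
		.irqs_enabled = true,
		.irq_pending = true,
		.irq_result = 1,	/* placeholder: handler has not run yet */
	};
	int rc;

	if (order == IRQ_OFF_BEFORE_CHECK)
		cpu.irqs_enabled = false;	/* v2: IRQs off before the check */

	rc = sim_try_mark(&cpu);		/* counter goes 0 -> 1 */
	assert(rc == 0);
	(void)rc;

	sim_deliver_irq(&cpu);	/* the window between check and IRQ-off */

	cpu.irqs_enabled = false;	/* old order: IRQs go off only here */
	/* ... bucket critical section would run here ... */
	sim_unmark(&cpu);
	cpu.irqs_enabled = true;	/* unlock path restores IRQs */

	sim_deliver_irq(&cpu);	/* a deferred interrupt runs now */
	return cpu.irq_result;
}
```

In this model, sim_nested_result(IRQ_OFF_AFTER_CHECK) returns -EBUSY
because the interrupt lands while the counter is already 1, whereas
sim_nested_result(IRQ_OFF_BEFORE_CHECK) returns 0 because the interrupt
is deferred until the counter has dropped back to 0. It says nothing
about the s390 crash itself.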

Ilya,

Could you please take a look at this?

Thanks,
Song

PS: the root image from the CI is not easy to use. Hopefully you
have something better than that.






Attachment: ci.config
Description: ci.config

Attachment: local.config
Description: local.config

