Re: BUG: unable to handle kernel paging request in bpf_dispatcher_xdp

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 12/9/22 10:53 PM, Jiri Olsa wrote:
On Fri, Dec 09, 2022 at 12:31:06PM -0800, Yonghong Song wrote:


On 12/9/22 7:20 AM, Jiri Olsa wrote:
On Fri, Dec 09, 2022 at 02:50:55PM +0100, Jiri Olsa wrote:
On Fri, Dec 09, 2022 at 12:22:37PM +0100, Jiri Olsa wrote:

SBIP


I'm trying to understand the severity of the issues and
whether we need to revert that commit asap since the merge window
is about to start.

Jiri, Peter,

ping.

cc-ing Thorsten, since he's tracking it now.

The config has CONFIG_X86_KERNEL_IBT=y.
Is it related?

sorry for late reply.. I still did not find the reason,
but I did not try with IBT yet, will test now

no difference with IBT enabled, can't reproduce the issue


ok, scratch that.. the reproducer got stuck on wifi init :-\

after I fix that I can now reproduce on my local config with
IBT enabled or disabled.. it's something else

I'm getting the error also when reverting the static call change,
looking for good commit, bisecting

I'm getting fail with:
     f0c4d9fc9cc9 (tag: v6.1-rc4) Linux 6.1-rc4

v6.1-rc1 is ok

so far I narrowed it down between rc1 and rc3.. bisect got me nowhere so far

attaching some more logs

looking at the code.. how do we ensure that code running through
bpf_prog_run_xdp will not get dispatcher image changed while
it's being exetuted

we use 'the other half' of the image when we add/remove programs,
but could bpf_dispatcher_update race with bpf_prog_run_xdp like:


cpu 0:                                  cpu 1:

bpf_prog_run_xdp
     ...
     bpf_dispatcher_xdp_func
        start exec image at offset 0x0

                                          bpf_dispatcher_update
                                                  update image at offset 0x800
                                          bpf_dispatcher_update
                                                  update image at offset 0x0

        still in image at offset 0x0


that might explain why I wasn't able to trigger that on
bare metal just in qemu

I tried patch below and it fixes the issue for me and seems
to confirm the race above.. but not sure it's the best fix

jirka


---
diff --git a/kernel/bpf/dispatcher.c b/kernel/bpf/dispatcher.c
index c19719f48ce0..6a2ced102fc7 100644
--- a/kernel/bpf/dispatcher.c
+++ b/kernel/bpf/dispatcher.c
@@ -124,6 +124,7 @@ static void bpf_dispatcher_update(struct bpf_dispatcher *d, int prev_num_progs)
   	}
   	__BPF_DISPATCHER_UPDATE(d, new ?: (void *)&bpf_dispatcher_nop_func);
+	synchronize_rcu_tasks();
   	if (new)
   		d->image_off = noff;

This might work. In arch/x86/kernel/alternative.c, we have following
code and comments. For text_poke, synchronize_rcu_tasks() might be able
to avoid concurrent execution and update.

so my idea was that we need to ensure all the current callers of
bpf_dispatcher_xdp_func (which should have rcu read lock, based
on the comment in bpf_prog_run_xdp) are gone before and new ones
execute the new image, so the next call to the bpf_dispatcher_update
will be safe to overwrite the other half of the image

If v6.1-rc1 was indeed okay, then it looks like this may be related to
the trampoline patching for the static_call? Did it repro on v6.1-rc1
just with dbe69b299884 ("bpf: Fix dispatcher patchable function entry
to 5 bytes nop") cherry-picked?

/**
  * text_poke_copy - Copy instructions into (an unused part of) RX memory
  * @addr: address to modify
  * @opcode: source of the copy
  * @len: length to copy, could be more than 2x PAGE_SIZE
  *
  * Not safe against concurrent execution; useful for JITs to dump
  * new code blocks into unused regions of RX memory. Can be used in
  * conjunction with synchronize_rcu_tasks() to wait for existing
  * execution to quiesce after having made sure no existing functions
  * pointers are live.
  */
void *text_poke_copy(void *addr, const void *opcode, size_t len)
{
         unsigned long start = (unsigned long)addr;
         size_t patched = 0;

         if (WARN_ON_ONCE(core_kernel_text(start)))
                 return NULL;

         mutex_lock(&text_mutex);
         while (patched < len) {
                 unsigned long ptr = start + patched;
                 size_t s;

                 s = min_t(size_t, PAGE_SIZE * 2 - offset_in_page(ptr), len -
patched);

                 __text_poke(text_poke_memcpy, (void *)ptr, opcode + patched,
s);
                 patched += s;
         }
         mutex_unlock(&text_mutex);
         return addr;
}




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux