On Fri, Dec 09, 2022 at 12:31:06PM -0800, Yonghong Song wrote: > > > On 12/9/22 7:20 AM, Jiri Olsa wrote: > > On Fri, Dec 09, 2022 at 02:50:55PM +0100, Jiri Olsa wrote: > > > On Fri, Dec 09, 2022 at 12:22:37PM +0100, Jiri Olsa wrote: > > > > > > SBIP > > > > > > > > > > > > > > > > > > > > > > > I'm trying to understand the severity of the issues and > > > > > > > > > > whether we need to revert that commit asap since the merge window > > > > > > > > > > is about to start. > > > > > > > > > > > > > > > > > > Jiri, Peter, > > > > > > > > > > > > > > > > > > ping. > > > > > > > > > > > > > > > > > > cc-ing Thorsten, since he's tracking it now. > > > > > > > > > > > > > > > > > > The config has CONFIG_X86_KERNEL_IBT=y. > > > > > > > > > Is it related? > > > > > > > > > > > > > > > > sorry for late reply.. I still did not find the reason, > > > > > > > > but I did not try with IBT yet, will test now > > > > > > > > > > > > > > no difference with IBT enabled, can't reproduce the issue > > > > > > > > > > > > > > > > > > > ok, scratch that.. the reproducer got stuck on wifi init :-\ > > > > > > > > > > > > after I fix that I can now reproduce on my local config with > > > > > > IBT enabled or disabled.. it's something else > > > > > > > > > > I'm getting the error also when reverting the static call change, > > > > > looking for good commit, bisecting > > > > > > > > > > I'm getting fail with: > > > > > f0c4d9fc9cc9 (tag: v6.1-rc4) Linux 6.1-rc4 > > > > > > > > > > v6.1-rc1 is ok > > > > > > > > so far I narrowed it down between rc1 and rc3.. bisect got me nowhere so far > > > > > > > > attaching some more logs > > > > > > looking at the code.. how do we ensure that code running through > > > bpf_prog_run_xdp will not get dispatcher image changed while > > > it's being exetuted > > > > > > we use 'the other half' of the image when we add/remove programs, > > > but could bpf_dispatcher_update race with bpf_prog_run_xdp like: > > > > > > > > > cpu 0: cpu 1: > > > > > > bpf_prog_run_xdp > > > ... > > > bpf_dispatcher_xdp_func > > > start exec image at offset 0x0 > > > > > > bpf_dispatcher_update > > > update image at offset 0x800 > > > bpf_dispatcher_update > > > update image at offset 0x0 > > > > > > still in image at offset 0x0 > > > > > > > > > that might explain why I wasn't able to trigger that on > > > bare metal just in qemu > > > > I tried patch below and it fixes the issue for me and seems > > to confirm the race above.. but not sure it's the best fix > > > > jirka > > > > > > --- > > diff --git a/kernel/bpf/dispatcher.c b/kernel/bpf/dispatcher.c > > index c19719f48ce0..6a2ced102fc7 100644 > > --- a/kernel/bpf/dispatcher.c > > +++ b/kernel/bpf/dispatcher.c > > @@ -124,6 +124,7 @@ static void bpf_dispatcher_update(struct bpf_dispatcher *d, int prev_num_progs) > > } > > __BPF_DISPATCHER_UPDATE(d, new ?: (void *)&bpf_dispatcher_nop_func); > > + synchronize_rcu_tasks(); > > if (new) > > d->image_off = noff; > > This might work. In arch/x86/kernel/alternative.c, we have following > code and comments. For text_poke, synchronize_rcu_tasks() might be able > to avoid concurrent execution and update. so my idea was that we need to ensure all the current callers of bpf_dispatcher_xdp_func (which should have rcu read lock, based on the comment in bpf_prog_run_xdp) are gone before and new ones execute the new image, so the next call to the bpf_dispatcher_update will be safe to overwrite the other half of the image jirka > > /** > * text_poke_copy - Copy instructions into (an unused part of) RX memory > * @addr: address to modify > * @opcode: source of the copy > * @len: length to copy, could be more than 2x PAGE_SIZE > * > * Not safe against concurrent execution; useful for JITs to dump > * new code blocks into unused regions of RX memory. Can be used in > * conjunction with synchronize_rcu_tasks() to wait for existing > * execution to quiesce after having made sure no existing functions > * pointers are live. > */ > void *text_poke_copy(void *addr, const void *opcode, size_t len) > { > unsigned long start = (unsigned long)addr; > size_t patched = 0; > > if (WARN_ON_ONCE(core_kernel_text(start))) > return NULL; > > mutex_lock(&text_mutex); > while (patched < len) { > unsigned long ptr = start + patched; > size_t s; > > s = min_t(size_t, PAGE_SIZE * 2 - offset_in_page(ptr), len - > patched); > > __text_poke(text_poke_memcpy, (void *)ptr, opcode + patched, > s); > patched += s; > } > mutex_unlock(&text_mutex); > return addr; > }