On Thu, 21 Dec 2023, Jiri Olsa wrote:

> On Thu, Dec 21, 2023 at 10:17:44AM +0000, Lee Jones wrote:
> > On Thu, 21 Dec 2023, Greg KH wrote:
> >
> > > On Thu, Dec 21, 2023 at 09:55:22AM +0000, Lee Jones wrote:
> > > > On Thu, 21 Dec 2023, Greg KH wrote:
> > > >
> > > > > On Thu, Dec 21, 2023 at 09:07:45AM +0000, Lee Jones wrote:
> > > > > > Dear Stable,
> > > > > >
> > > > > > > Lee pointed out issue found by syscaller [0] hitting BUG in prog array
> > > > > > > map poke update in prog_array_map_poke_run function due to error value
> > > > > > > returned from bpf_arch_text_poke function.
> > > > > > >
> > > > > > > There's race window where bpf_arch_text_poke can fail due to missing
> > > > > > > bpf program kallsym symbols, which is accounted for with check for
> > > > > > > -EINVAL in that BUG_ON call.
> > > > > > >
> > > > > > > The problem is that in such case we won't update the tail call jump
> > > > > > > and cause imbalance for the next tail call update check which will
> > > > > > > fail with -EBUSY in bpf_arch_text_poke.
> > > > > > >
> > > > > > > I'm hitting following race during the program load:
> > > > > > >
> > > > > > >   CPU 0                             CPU 1
> > > > > > >
> > > > > > >   bpf_prog_load
> > > > > > >     bpf_check
> > > > > > >       do_misc_fixups
> > > > > > >         prog_array_map_poke_track
> > > > > > >
> > > > > > >                                     map_update_elem
> > > > > > >                                       bpf_fd_array_map_update_elem
> > > > > > >                                         prog_array_map_poke_run
> > > > > > >
> > > > > > >                                           bpf_arch_text_poke returns -EINVAL
> > > > > > >
> > > > > > >     bpf_prog_kallsyms_add
> > > > > > >
> > > > > > > After bpf_arch_text_poke (CPU 1) fails to update the tail call jump, the next
> > > > > > > poke update fails on expected jump instruction check in bpf_arch_text_poke
> > > > > > > with -EBUSY and triggers the BUG_ON in prog_array_map_poke_run.
> > > > > > >
> > > > > > > Similar race exists on the program unload.
> > > > > > >
> > > > > > > Fixing this by moving the update to bpf_arch_poke_desc_update function which
> > > > > > > makes sure we call __bpf_arch_text_poke that skips the bpf address check.
> > > > > > >
> > > > > > > Each architecture has slightly different approach wrt looking up bpf address
> > > > > > > in bpf_arch_text_poke, so instead of splitting the function or adding new
> > > > > > > 'checkip' argument in previous version, it seems best to move the whole
> > > > > > > map_poke_run update as arch specific code.
> > > > > > >
> > > > > > >   [0] https://syzkaller.appspot.com/bug?extid=97a4fe20470e9bc30810
> > > > > > >
> > > > > > > Cc: Lee Jones <lee@xxxxxxxxxx>
> > > > > > > Cc: Maciej Fijalkowski <maciej.fijalkowski@xxxxxxxxx>
> > > > > > > Fixes: ebf7d1f508a7 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT")
> > > > > > > Reported-by: syzbot+97a4fe20470e9bc30810@xxxxxxxxxxxxxxxxxxxxxxxxx
> > > > > > > Acked-by: Yonghong Song <yonghong.song@xxxxxxxxx>
> > > > > > > Signed-off-by: Jiri Olsa <jolsa@xxxxxxxxxx>
> > > > > > > ---
> > > > > > >  arch/x86/net/bpf_jit_comp.c | 46 +++++++++++++++++++++++++++++
> > > > > > >  include/linux/bpf.h         |  3 ++
> > > > > > >  kernel/bpf/arraymap.c       | 58 +++++++------------------------------
> > > > > > >  3 files changed, 59 insertions(+), 48 deletions(-)
> > > > > >
> > > > > > Please could we have this backported?
> > > > > >
> > > > > > Guided by the Fixes: tag.
> > > > >
> > > > > <formletter>
> > > > >
> > > > > This is not the correct way to submit patches for inclusion in the
> > > > > stable kernel tree.  Please read:
> > > > >     https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
> > > > > for how to do this properly.
> > > > >
> > > > > </formletter>
> > > >
> > > > Apologies.
> > > >
> > > > Commit ID: 4b7de801606e504e69689df71475d27e35336fb3
> > > > Subject:   bpf: Fix prog_array_map_poke_run map poke update
> > > > Reason:    Fixes a race condition in BPF.
> > > > Versions:  linux-5.10.y+, as specified by the Fixes: tag above
> > >
> > > Did not apply to 5.10.y or 5.15.y, so if you need/want it there, we will
> > > need a working backport that has been tested.  Other trees it's now
> > > queued up for.
> >
> > Thank you.
>
> please let me know if you need any help with that, I can check on that

I absolutely do.  I have no way to test BPF.

-- 
Lee Jones [李琼斯]