On Thu, 10 Feb 2022 at 09:35, Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:
>
> On 2/10/22 9:11 AM, Willy Tarreau wrote:
> > On Wed, Feb 09, 2022 at 10:08:07PM -0800, syzbot wrote:
> >> syzbot has bisected this issue to:
> >>
> >> commit 7661809d493b426e979f39ab512e3adf41fbcc69
> >> Author: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> >> Date:   Wed Jul 14 16:45:49 2021 +0000
> >>
> >>      mm: don't allow oversized kvmalloc() calls
> >>
> >> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=13bc74c2700000
> >> start commit:   f4bc5bbb5fef Merge tag 'nfsd-5.17-2' of git://git.kernel.o..
> >> git tree:       upstream
> >> final oops:     https://syzkaller.appspot.com/x/report.txt?x=107c74c2700000
> >> console output: https://syzkaller.appspot.com/x/log.txt?x=17bc74c2700000
> >> kernel config:  https://syzkaller.appspot.com/x/.config?x=5707221760c00a20
> >> dashboard link: https://syzkaller.appspot.com/bug?extid=11421fbbff99b989670e
> >> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12e514a4700000
> >> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=15fcdf8a700000
> >>
> >> Reported-by: syzbot+11421fbbff99b989670e@xxxxxxxxxxxxxxxxxxxxxxxxx
> >> Fixes: 7661809d493b ("mm: don't allow oversized kvmalloc() calls")
> >>
> >> For information about bisection process see: https://goo.gl/tpsmEJ#bisection
> >
> > Interesting, so in fact syzkaller has shown that the aforementioned
> > patch does its job well and has spotted a call path by which a single
> > userland setsockopt() can request more than 2 GB allocation in the
> > kernel. Most likely that's in fact what needs to be addressed.
> >
> > FWIW the call trace at the URL above is:
> >
> > Call Trace:
> >   kvmalloc include/linux/mm.h:806 [inline]
> >   kvmalloc_array include/linux/mm.h:824 [inline]
> >   kvcalloc include/linux/mm.h:829 [inline]
> >   xdp_umem_pin_pages net/xdp/xdp_umem.c:102 [inline]
> >   xdp_umem_reg net/xdp/xdp_umem.c:219 [inline]
> >   xdp_umem_create+0x6a5/0xf00 net/xdp/xdp_umem.c:252
> >   xsk_setsockopt+0x604/0x790 net/xdp/xsk.c:1068
> >   __sys_setsockopt+0x1fd/0x4e0 net/socket.c:2176
> >   __do_sys_setsockopt net/socket.c:2187 [inline]
> >   __se_sys_setsockopt net/socket.c:2184 [inline]
> >   __x64_sys_setsockopt+0xb5/0x150 net/socket.c:2184
> >   do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> >   do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
> >   entry_SYSCALL_64_after_hwframe+0x44/0xae
> >
> > and the meaningful part of the repro is:
> >
> >    syscall(__NR_mmap, 0x1ffff000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
> >    syscall(__NR_mmap, 0x20000000ul, 0x1000000ul, 7ul, 0x32ul, -1, 0ul);
> >    syscall(__NR_mmap, 0x21000000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
> >    intptr_t res = 0;
> >    res = syscall(__NR_socket, 0x2cul, 3ul, 0);
> >    if (res != -1)
> >      r[0] = res;
> >    *(uint64_t*)0x20000080 = 0;
> >    *(uint64_t*)0x20000088 = 0xfff02000000;
> >    *(uint32_t*)0x20000090 = 0x800;
> >    *(uint32_t*)0x20000094 = 0;
> >    *(uint32_t*)0x20000098 = 0;
> >    syscall(__NR_setsockopt, r[0], 0x11b, 4, 0x20000080ul, 0x20ul);
>
> Bjorn had a comment back then when the issue was first raised here:
>
>    https://lore.kernel.org/bpf/3f854ca9-f5d6-4065-c7b1-5e5b25ea742f@xxxxxxxxxxxxx/
>
> There was earlier discussion from Andrew to potentially retire the warning:
>
>    https://lore.kernel.org/bpf/20211201202905.b9892171e3f5b9a60f9da251@xxxxxxxxxxxxxxxxxxxx/
>
> Bjorn / Magnus / Andrew, anyone planning to follow-up on this issue?
>

Honestly, I would need some guidance on how to progress. I could just
change from U32_MAX to INT_MAX, but as I stated earlier (lore-link
above), that has a hacky feeling to it.
Andrew's mail didn't really reach a consensus.

From my perspective, the code isn't broken once the memcg limits are
taken into consideration. Introducing a LARGE flag or a new
"_yes_this_can_be_huge_but_it_is_ok()" version would make sense if this
problem applies to more users in the kernel.

So, thoughts? ;-)


Björn
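
To make the numbers from the quoted repro concrete, here is a minimal sketch of the arithmetic only, not the kernel code. It assumes 4 KiB pages and 8-byte pointers (x86-64), and it assumes that the value the repro stores at 0x20000088 is the umem length in bytes and that kvcalloc() in xdp_umem_pin_pages() is sized as one pointer per page. All sketch_ names are made up for illustration.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumptions for illustration: 4 KiB pages, 8-byte pointers (x86-64). */
#define SKETCH_PAGE_SIZE 4096ULL
#define SKETCH_PTR_SIZE  8ULL

/* Number of whole pages covered by a umem of len bytes. */
uint64_t sketch_npgs(uint64_t len)
{
	return len / SKETCH_PAGE_SIZE;
}

/* Bytes needed for an array with one page pointer per umem page. */
uint64_t sketch_pgs_array_bytes(uint64_t len)
{
	return sketch_npgs(len) * SKETCH_PTR_SIZE;
}

/* True if the page count passes an "npgs <= U32_MAX" style check. */
bool sketch_passes_u32_check(uint64_t len)
{
	return sketch_npgs(len) <= UINT32_MAX;
}

/* True if the pointer array would be larger than INT_MAX bytes. */
bool sketch_exceeds_int_max(uint64_t len)
{
	return sketch_pgs_array_bytes(len) > (uint64_t)INT_MAX;
}
```

Under those assumptions, the repro's length of 0xfff02000000 gives 0xfff02000 pages, which is still below U32_MAX, while the pointer array comes to 34351415296 bytes (just under 32 GiB), far above INT_MAX.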