Levi Zim wrote: > On 2024-12-09 15:02, John Fastabend wrote: > > Levi Zim via B4 Relay wrote: > >> From: Levi Zim <rsworktech@xxxxxxxxxxx> > >> > >> bpf kselftest sockhash::test_txmsg_cork_hangs in test_sockmap.c triggers a > >> kernel NULL pointer dereference: > > Is it just the cork test that causes issue? > Yes. More specifically only "sockhash::test_txmsg_cork_hangs" but not > "sockmap::test_txmsg_cork_hangs" > > > >> BUG: kernel NULL pointer dereference, address: 0000000000000008 > >> ? __die_body+0x6e/0xb0 > >> ? __die+0x8b/0xa0 > >> ? page_fault_oops+0x358/0x3c0 > >> ? local_clock+0x19/0x30 > >> ? lock_release+0x11b/0x440 > >> ? kernelmode_fixup_or_oops+0x54/0x60 > >> ? __bad_area_nosemaphore+0x4f/0x210 > >> ? mmap_read_unlock+0x13/0x30 > >> ? bad_area_nosemaphore+0x16/0x20 > >> ? do_user_addr_fault+0x6fd/0x740 > >> ? prb_read_valid+0x1d/0x30 > >> ? exc_page_fault+0x55/0xd0 > >> ? asm_exc_page_fault+0x2b/0x30 > >> ? splice_to_socket+0x52e/0x630 > >> ? shmem_file_splice_read+0x2b1/0x310 > >> direct_splice_actor+0x47/0x70 > >> splice_direct_to_actor+0x133/0x300 > >> ? do_splice_direct+0x90/0x90 > >> do_splice_direct+0x64/0x90 > >> ? __ia32_sys_tee+0x30/0x30 > >> do_sendfile+0x214/0x300 > >> __se_sys_sendfile64+0x8e/0xb0 > >> __x64_sys_sendfile64+0x25/0x30 > >> x64_sys_call+0xb82/0x2840 > >> do_syscall_64+0x75/0x110 > >> entry_SYSCALL_64_after_hwframe+0x4b/0x53 > >> > >> This is caused by tcp_bpf_sendmsg() returning a larger value(12289) than > >> size (8192), which causes the while loop in splice_to_socket() to release > >> an uninitialized pipe buf. > >> > >> The underlying cause is that this code assumes sk_msg_memcopy_from_iter() > >> will copy all bytes upon success but it actually might only copy part of > >> it. > > The intent was to ensure we allocate a buffer large enough to fit the > > data. I guess the cork + send here is not allocating enough bytes? > I am not familiar enough with neither this part of code nor tcp with bpf > in general and just > hit this bug when trying to run the bpf kselftests. Then I decided to > debug it. > > In my perspective the buffer(8192) is large enough to hold the data(8192), > but tcp_bpf_sendmsg returns 12289 which is a little surprising for me. > > Could you further elaborate why 8192 bytes are not enough? Thanks! > There is some bug in the buffer allocation sizing that is happening because of cork'd data. The cork logic is used to hold extra bytes in buffer until N bytes have been received. I'm not really opposed to the fix here, but would be good to understand how it got here. I have some time tommorrow I can look a bit more.