On Tue, Oct 1, 2024 at 10:04 AM Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote:
>
> On Tue, Oct 1, 2024 at 7:48 AM Alexei Starovoitov
> <alexei.starovoitov@xxxxxxxxx> wrote:
> >
> > On Tue, Oct 1, 2024 at 4:26 AM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
> > >
> > > On Mon, 2024-09-30 at 15:00 -0700, Andrii Nakryiko wrote:
> > >
> > > [...]
> > >
> > > > Right now, the only way to pass dynamically sized anything is through
> > > > dynptr, AFAIU.
> > >
> > > But we do have 'is_kfunc_arg_mem_size()' that checks for __sz suffix,
> > > e.g. used for bpf_copy_from_user_str():
> > >
> > > /**
> > >  * bpf_copy_from_user_str() - Copy a string from an unsafe user address
> > >  * @dst: Destination address, in kernel space. This buffer must be
> > >  *       at least @dst__sz bytes long.
> > >  * @dst__sz: Maximum number of bytes to copy, includes the trailing NUL.
> > >  * ...
> > >  */
> > > __bpf_kfunc int bpf_copy_from_user_str(void *dst, u32 dst__sz, const void __user *unsafe_ptr__ign, u64 flags)
> > >
> > > However, this suffix won't work for strnstr because of the arguments order.
> >
> > Stating the obvious... we don't need to keep the order exactly the same.
> >
> > Regarding all of these kfuncs... as Andrii pointed out 'const char *s'
> > means that the verifier will check that 's' points to a valid byte.
> > I think we can do a hybrid static + dynamic safety scheme here.
> > All of the kfunc signatures can stay the same, but we'd have to
> > open code all string helpers with __get_kernel_nofault() instead of
> > direct memory access.
> > Since the first byte is guaranteed to be valid by the verifier
> > we only need to make sure that the s+N bytes won't cause page faults
>
> You mean to just check that s[N-1] can be read? Given a large enough
> N, couldn't it be that some page between s[0] and s[N-1] still can be
> unmapped, defeating this check?

Just checking s[0] and s[N-1] is not enough, obviously, especially since
the logic won't know where the NUL byte is, so N is unknown.
I meant that all of the str* kfuncs will read every byte via
__get_kernel_nofault() until they find \0.
It can be optimized to 8-byte accesses.
The open coding (aka copy-paste) is unfortunate, of course.
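
To make the byte-by-byte idea concrete, here is a rough userspace sketch, not kernel code. Everything in it is hypothetical: 'struct fake_map', read_byte_nofault() and strlen_nofault() are made-up names, and the "mapped" range check only stands in for what __get_kernel_nofault() would do (the kernel macro bails out to an error label on a fault, which is modeled here as a false return). The 'max' bound and the -E2BIG return are also my assumptions, not something settled in this thread.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of "mapped" memory: one readable range [base, base + len). */
struct fake_map {
	uintptr_t base;
	size_t len;
};

/*
 * Hypothetical stand-in for a fault-safe single-byte read.
 * Returns false instead of faulting when addr is outside the readable range.
 */
static bool read_byte_nofault(const struct fake_map *m, uintptr_t addr, char *out)
{
	if (addr < m->base || addr - m->base >= m->len)
		return false;
	*out = *(const char *)addr;
	return true;
}

/*
 * Open-coded strlen: every byte goes through the fault-safe read, so an
 * unmapped page anywhere before the NUL is caught, not just at s[0] or s[N-1].
 * Returns the length, -EFAULT on an unreadable byte, or -E2BIG if no NUL
 * is found within max bytes (assumed bound).
 */
static long strlen_nofault(const struct fake_map *m, const char *s, size_t max)
{
	size_t i;

	for (i = 0; i < max; i++) {
		char c;

		if (!read_byte_nofault(m, (uintptr_t)s + i, &c))
			return -EFAULT;
		if (c == '\0')
			return (long)i;
	}
	return -E2BIG;
}
```

The 8-byte optimization would replace the per-byte read with a word-sized read plus a has-zero-byte check; that is left out here to keep the sketch small.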