On Tue, Aug 13, 2024 at 11:30 AM Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote: > > On Tue, Aug 13, 2024 at 11:10 AM Andrii Nakryiko > <andrii.nakryiko@xxxxxxxxx> wrote: > > > > On Tue, Aug 13, 2024 at 9:08 AM Alexei Starovoitov > > <alexei.starovoitov@xxxxxxxxx> wrote: > > > > > > On Tue, Aug 13, 2024 at 6:30 AM Jordan Rome <linux@xxxxxxxxxxxxxx> wrote: > > > > > > > > On Tue, Aug 13, 2024 at 6:27 AM Jordan Rome <linux@xxxxxxxxxxxxxx> wrote: > > > > > > > > > > On Mon, Aug 12, 2024 at 10:10 PM Alexei Starovoitov > > > > > <alexei.starovoitov@xxxxxxxxx> wrote: > > > > > > > > > > > > On Mon, Aug 12, 2024 at 6:26 PM Jordan Rome <linux@xxxxxxxxxxxxxx> wrote: > > > > > > > > > > > > > > This adds a kfunc wrapper around strncpy_from_user, > > > > > > > which can be called from sleepable BPF programs. > > > > > > > > > > > > > > This matches the non-sleepable 'bpf_probe_read_user_str' > > > > > > > helper. > > > > > > > > > > > > > > Signed-off-by: Jordan Rome <linux@xxxxxxxxxxxxxx> > > > > > > > --- > > > > > > > kernel/bpf/helpers.c | 36 ++++++++++++++++++++++++++++++++++++ > > > > > > > 1 file changed, 36 insertions(+) > > > > > > > > > > > > > > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c > > > > > > > index d02ae323996b..e87d5df658cb 100644 > > > > > > > --- a/kernel/bpf/helpers.c > > > > > > > +++ b/kernel/bpf/helpers.c > > > > > > > @@ -2939,6 +2939,41 @@ __bpf_kfunc void bpf_iter_bits_destroy(struct bpf_iter_bits *it) > > > > > > > bpf_mem_free(&bpf_global_ma, kit->bits); > > > > > > > } > > > > > > > > > > > > > > +/** > > > > > > > + * bpf_copy_from_user_str() - Copy a string from an unsafe user address > > > > > > > + * @dst: Destination address, in kernel space. This buffer must be at > > > > > > > + * least @dst__szk bytes long. > > > > > > > + * @dst__szk: Maximum number of bytes to copy, including the trailing NUL. > > > > > > > + * @unsafe_ptr__ign: Source address, in user space. > > > > > > > + * > > > > > > > + * Copies a NUL-terminated string from userspace to BPF space. If user string is > > > > > > > + * too long this will still ensure zero termination in the dst buffer unless > > > > > > > + * buffer size is 0. > > > > > > > + */ > > > > > > > +__bpf_kfunc int bpf_copy_from_user_str(void *dst, u32 dst__szk, const void __user *unsafe_ptr__ign) > > > > > > > +{ > > > > > > > + int ret; > > > > > > > + int count; > > > > > > > + > > > > > > > + if (unlikely(!dst__szk)) > > > > > > > + return 0; > > > > > > > + > > > > > > > + count = dst__szk - 1; > > > > > > > + if (unlikely(!count)) { > > > > > > > + ((char *)dst)[0] = '\0'; > > > > > > > + return 1; > > > > > > > + } > > > > > > > + > > > > > > > + ret = strncpy_from_user(dst, unsafe_ptr__ign, count); > > > > > > > + if (ret >= 0) { > > > > > > > + if (ret == count) > > > > > > > + ((char *)dst)[ret] = '\0'; > > > > > > > + ret++; > > > > > > > + } > > > > > > > + > > > > > > > + return ret; > > > > > > > +} > > > > > > > > > > > > The above will not pad the buffer and it will create instability > > > > > > when the target buffer is a part of the map key. Consider: > > > > > > > > > > > > struct map_key { > > > > > > char str[100]; > > > > > > }; > > > > > > struct { > > > > > > __uint(type, BPF_MAP_TYPE_HASH); > > > > > > __type(key, struct map_key); > > > > > > } hash SEC(".maps"); > > > > > > > > > > > > struct map_key key; > > > > > > bpf_copy_from_user_str(key.str, sizeof(key.str), user_string); > > > > > > > > > > > > The verifier will think that all of the 'key' is initialized, > > > > > > but for short strings the key will have garbage. > > > > > > > > > > > > bpf_probe_read_kernel_str() has the same issue as above, but > > > > > > let's fix it here first and update read_kernel_str() later. > > > > > > > > > > > > pw-bot: cr > > > > > > > > > > You're saying we should always do a memset using `dst__szk` on success > > > > > of copying the string? > > > > > > > > Something like this? > > > > ``` > > > > ret = strncpy_from_user(dst, unsafe_ptr__ign, count); > > > > if (ret >= 0) { > > > > if (ret <= count) > > > > memset((char *)dst + ret, 0, dst__szk - ret); > > > > ret++; > > > > } > > > > ``` > > > > > > yep. something like this. I didn't check the math. > > > > I'm a bit worried about this unconditional memset without having a way > > to disable it. In practice, lots of cases won't use the destination > > buffer as a map key, but rather just send it over ringbuf. So paying > > the price of zeroing out seems unnecessary. > > > > It's quite often (I do that in retsnoop, for instance; and we have > > other cases in our production) that we have a pretty big buffer, but > > expect that most of the time strings will be much smaller. So we can > > have a 1K buffer, but get 20 bytes of string content (and we end up > > sending only actual useful size of data over ringbuf/perfbuf, so not > > even paying 1K memcpy() overhead). Paying for memset()'ing the entire > > 1K (and string reading can happen in a loop, so this memsetting will > > be happening over and over, unnecessarily), seems excessive. > > > > Given it's pretty easy to do memset(0) using bpf_prober_read(dst, sz, > > NULL), maybe we shouldn't do memsetting unconditionally? We can add a > > loud comment stating the danger of using the resulting buffer as map > > key without clearing the unfilled part of the buffer and that should > > be sufficient? > > probe_read as memset is a quirk that folks learned to abuse. > Let's add a flag to this bpf_copy_from_user_str() kfunc instead, > so it behaves either like strscpy_pad or strscpy. agreed, a flag sounds good