Re: [PATCH bpf-next 1/2] bpf: Add kfuncs for read-only string operations

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 10/3/24 19:02, Alexei Starovoitov wrote:
> On Wed, Oct 2, 2024 at 9:51 PM Viktor Malik <vmalik@xxxxxxxxxx> wrote:
>>
>> On 10/2/24 18:55, Alexei Starovoitov wrote:
>>> On Tue, Oct 1, 2024 at 11:12 PM Viktor Malik <vmalik@xxxxxxxxxx> wrote:
>>>>
>>>> On 10/1/24 19:40, Andrii Nakryiko wrote:
>>>>> On Tue, Oct 1, 2024 at 10:34 AM Alexei Starovoitov
>>>>> <alexei.starovoitov@xxxxxxxxx> wrote:
>>>>>>
>>>>>> On Tue, Oct 1, 2024 at 10:04 AM Andrii Nakryiko
>>>>>> <andrii.nakryiko@xxxxxxxxx> wrote:
>>>>>>>
>>>>>>> On Tue, Oct 1, 2024 at 7:48 AM Alexei Starovoitov
>>>>>>> <alexei.starovoitov@xxxxxxxxx> wrote:
>>>>>>>>
>>>>>>>> On Tue, Oct 1, 2024 at 4:26 AM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
>>>>>>>>>
>>>>>>>>> On Mon, 2024-09-30 at 15:00 -0700, Andrii Nakryiko wrote:
>>>>>>>>>
>>>>>>>>> [...]
>>>>>>>>>
>>>>>>>>>> Right now, the only way to pass dynamically sized anything is through
>>>>>>>>>> dynptr, AFAIU.
>>>>>>>>>
>>>>>>>>> But we do have 'is_kfunc_arg_mem_size()' that checks for __sz suffix,
>>>>>>>>> e.g. used for bpf_copy_from_user_str():
>>>>>>>>>
>>>>>>>>> /**
>>>>>>>>>  * bpf_copy_from_user_str() - Copy a string from an unsafe user address
>>>>>>>>>  * @dst:             Destination address, in kernel space.  This buffer must be
>>>>>>>>>  *                   at least @dst__sz bytes long.
>>>>>>>>>  * @dst__sz:         Maximum number of bytes to copy, includes the trailing NUL.
>>>>>>>>>  * ...
>>>>>>>>>  */
>>>>>>>>> __bpf_kfunc int bpf_copy_from_user_str(void *dst, u32 dst__sz, const void __user *unsafe_ptr__ign, u64 flags)
>>>>>>>>>
>>>>>>>>> However, this suffix won't work for strnstr because of the arguments order.
>>>>>>>>
>>>>>>>> Stating the obvious... we don't need to keep the order exactly the same.
>>>>>>>>
>>>>>>>> Regarding all of these kfuncs... as Andrii pointed out 'const char *s'
>>>>>>>> means that the verifier will check that 's' points to a valid byte.
>>>>>>>> I think we can do a hybrid static + dynamic safety scheme here.
>>>>>>>> All of the kfunc signatures can stay the same, but we'd have to
>>>>>>>> open code all string helpers with __get_kernel_nofault() instead of
>>>>>>>> direct memory access.
>>>>>>>> Since the first byte is guaranteed to be valid by the verifier
>>>>>>>> we only need to make sure that the s+N bytes won't cause page faults
>>>>>>>
>>>>>>> You mean to just check that s[N-1] can be read? Given a large enough
>>>>>>> N, couldn't it be that some page between s[0] and s[N-1] still can be
>>>>>>> unmapped, defeating this check?
>>>>>>
>>>>>> Just checking s[0] and s[N-1] is not enough, obviously, and especially,
>>>>>> since the logic won't know where nul byte is, so N is unknown.
>>>>>> I meant to that all of str* kfuncs will be reading all bytes
>>>>>> via __get_kernel_nofault() until they find \0.
>>>>>
>>>>> Ah, ok, I see what you mean now.
>>>>>
>>>>>> It can be optimized to 8 byte access.
>>>>>> The open coding (aka copy-paste) is unfortunate, of course.
>>>>>
>>>>> Yep, this sucks.
>>>>
>>>> Yeah, that's quite annoying. I really wanted to avoid doing that. Also,
>>>> we won't be able to use arch-optimized versions of the functions.
>>>>
>>>> Just to make sure I understand things correctly - can we do what Eduard
>>>> suggested and add explicit sizes for all arguments using the __sz
>>>> suffix? So something like:
>>>>
>>>>     const char *bpf_strnstr(const char *s1, u32 s1__sz, const char *s2, u32 s2__sz);
>>>
>>> That's ok-ish, but you probably want:
>>>
>>> const char *bpf_strnstr(void *s1, u32 s1__sz, void *s2, u32 s2__sz);
>>>
>>> and then to call strnstr() you still need to strnlen(s2, s2__sz).
>>>
>>> But a more general question... how always passing size will work
>>> for bpftrace ? Does it always know the upper bound of storage where
>>> strings are stored?
>>
>> Yes, it does. The strings must be read via the str() call (which
>> internally calls bpf_probe_read_str) and there's an upper bound on the
>> size of each string.
> 
> which sounds like a bpftrace current limitation and not something to
> depend on from kfunc design pov.
> Wouldn't you want strings to be arbitrary length?

Sure, there's just a lot of work to be done on that front...

But I agree, it shouldn't be a limitation for the kfuncs. I just wanted
to point out that if we agreed to only create kfuncs for bounded
functions, bpftrace use-case would be (at least for the moment) covered.

Anyways, it seems to me that both the bounded and the unbounded versions
have their place. Would it be ok with you to open-code just the
unbounded ones and call in-kernel implementations for the bounded ones?
Or would you prefer to have everything unified (i.e. open-coded)?

Also, just out of curiosity, what are the ways to create/obtain strings
of unbounded length in BPF programs? Arguments of BTF-enabled program
types (like fentry)? Any other examples? Because IIUC, when you read
strings from kernel/userspace memory using bpf_probe_read_str, you
always need to specify the size.

> 
>>> I would think __get_kernel_nofault() approach is user friendlier.
>>
>> That's probably true but isn't there still the problem that strings are
>> not necessarily null-terminated? And in such case, unbounded string
>> functions may not terminate which is not allowed in BPF?
> 
> kfuncs that are searching for nul via loop with __get_kernel_nofault()
> would certainly need an upper bound.
> Something like PATH_MAX (4k) or XATTR_SIZE_MAX (64k)
> would cover 99.99% of use cases.

Right, makes sense.

Thanks!
Viktor

> 





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux