Re: [PATCH bpf-next v2 03/13] bpf: Support bpf_dynptr-typed map key in bpf syscall

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Sep 28, 2022 at 7:11 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
>
> Hi,
>
> On 9/29/2022 8:16 AM, Andrii Nakryiko wrote:
> > On Sat, Sep 24, 2022 at 6:18 AM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
> >> From: Hou Tao <houtao1@xxxxxxxxxx>
> >>
> >> Userspace application uses bpf syscall to lookup or update bpf map. It
> >> passes a pointer of fixed-size buffer to kernel to represent the map
> >> key. To support map with variable-length key, introduce bpf_dynptr_user
> >> to allow userspace to pass a pointer of bpf_dynptr_user to specify the
> >> address and the length of key buffer. And in order to represent dynptr
> >> from userspace, adding a new dynptr type: BPF_DYNPTR_TYPE_USER. Because
> >> BPF_DYNPTR_TYPE_USER-typed dynptr is not available from bpf program, so
> >> no verifier update is needed.
> >>
> >> Add dynptr_key_off in bpf_map to distinguish map with fixed-size key
> >> from map with variable-length. dynptr_key_off is less than zero for
> >> fixed-size key and can only be zero for dynptr key.
> >>
> >> For dynptr-key map, key btf type is bpf_dynptr and key size is 16, so
> >> use the lower 32-bits of map_extra to specify the maximum size of dynptr
> >> key.
> >>
> >> Signed-off-by: Hou Tao <houtao1@xxxxxxxxxx>
> >> ---
> > This is a great feature and you've put lots of high-quality work into
> > this! Looking forward to have qp-trie BPF map available. Apart from
> > your discussion with Alexie about locking and memory
> > allocation/reused, I have questions about this dynptr from user-space
> > interface. Let's discuss it in this patch to not interfere.
> >
> > I'm trying to understand why there should be so many new concepts and
> > interfaces just to allow variable-sized keys. Can you elaborate on
> > that? Like why do we even need BPF_DYNPTR_TYPE_USER? Why user can't
> > just pass a void * (casted to u64) pointer and size of the memory
> > pointed to it, and kernel will just copy necessary amount of data into
> > kvmalloc'ed temporary region?
> The main reason is that map operations from syscall and bpf program use the same
> ops in bpf_map_ops (e.g. map_update_elem). If only use dynptr_kern for bpf
> program, then
> have to define three new operations for bpf program. Even more, after defining
> two different map ops for the same operation from syscall and bpf program, the
> internal  implementation of qp-trie still need to convert these two different
> representations of variable-length key into bpf_qp_trie_key. It introduces
> unnecessary conversion, so I think it may be a good idea to pass dynptr_kern to
> qp-trie even for bpf syscall.
>
> And now in bpf_attr, for BPF_MAP_*_ELEM command, there is no space to pass an
> extra key size. It seems bpf_attr can be extend, but even it is extented, it
> also means in libbpf we need to provide a new API group to support operationg on
> dynptr key map, because the userspace needs to pass the key size as a new argument.


You are right that the current assumption of implicit key/value size
doesn't work for these variable-key/value-length maps. But I think the
right answer is actually to make sure that we have a map_update_elem
callback variant that accepts key/value size explicitly. I still think
that the syscall interface shouldn't introduce a concept of dynptr.
>From user-space's point of view dynptr is just a memory pointer +
associated memory size. Let's keep it simple. And yes, it will be a
new libbpf API for bpf_map_lookup_elem/bpf_map_update_elem. That's
fine.


> >
> > It also seems like you want to allow key (and maybe value as well, not
> > sure) to be a custom user-defined type where some of the fields are
> > struct bpf_dynptr. I think it's a big overcomplication, tbh. I'd say
> > it's enough to just say that entire key has to be described by a
> > single bpf_dynptr. Then we can have bpf_map_lookup_elem_dynptr(map,
> > key_dynptr, flags) new helper to provide variable-sized key for
> > lookup.
> For qp-trie, it will only support a single dynptr as the map key. In the future
> maybe other map will support map key with embedded dynptrs. Maybe Joanne can
> share some vision about such use case.

My point was that instead of saying that key is some fixed-size struct
in which one of the fields is dynptr (and then when comparing you have
to compare part of struct, then dynptr contents, then the other part
of struct?), just say that entire key is represented by dynptr,
implicitly (it's just a blob of bytes). That seems more
straightforward.

> >
> > I think it would keep it much simpler. But if I'm missing something,
> > it would be good to understand that. Thanks!
> >
> >
> >>  include/linux/bpf.h            |   8 +++
> >>  include/uapi/linux/bpf.h       |   6 ++
> >>  kernel/bpf/map_in_map.c        |   3 +
> >>  kernel/bpf/syscall.c           | 121 +++++++++++++++++++++++++++------
> >>  tools/include/uapi/linux/bpf.h |   6 ++
> >>  5 files changed, 125 insertions(+), 19 deletions(-)
> >>
> > [...]
> > .
>



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux