On Wed, Aug 16, 2023 at 11:35 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote: > > ping ? Sorry for the delay. I've been on PTO. > On 8/3/2023 9:28 PM, Hou Tao wrote: > > > > On 8/3/2023 9:19 PM, Hou Tao wrote: > >> Hi, > >> > >> I am preparing for qp-trie v4, but I need some help on how to support > >> variable-sized key in bpf syscall. The implementation of qp-trie needs > >> to distinguish between dynptr key from bpf program and variable-sized > >> key from bpf syscall. In v3, I added a new dynptr type: > >> BPF_DYNPTR_TYPE_USER for variable-sized key from bpf syscall [0], so > >> both bpf program and bpf syscall will use the same type to represent the > >> variable-sized key, but Andrii thought ptr+size tuple was simpler and > >> would be enough for user APIs, so in v4, the type of key for bpf program > >> and syscall will be different. One way to handle that is to add a new > >> parameter in .map_lookup_elem()/.map_delete_elem()/.map_update_elem() to > >> tell whether the key comes from bpf program or syscall or introduce new > >> APIs in bpf_map_ops for variable-sized key related syscall, but I think > >> it will introduce too much churn. Considering that the size of > >> bpf_dynptr_kern is 8-bytes aligned, so I think maybe I could reuse the > >> lowest 1-bit of key pointer to tell qp-trie whether or not it is a > >> bpf_dynptr_kern or a variable-sized key pointer from syscall. For > >> bpf_dynptr_kern, because it is 8B-aligned, so its lowest bit must be 0, > >> and for variable-sized key from syscall, I could allocated a 4B-aligned > >> pointer and setting the lowest bit as 1, so qp-trie can distinguish > >> between these two types of pointer. The question is that I am not sure > >> whether the idea above is a good one or not. Does it sound fragile ? Or > >> is there any better way to handle that ? Let's avoid bit hacks. They're not extensible and should be used only in cases where performance matters a lot or memory constraints are extreme. ptr/sz tuple from syscall side sounds the simplest. I agree with Andrii exposing the dynptr concept to user space and especially as part of syscall is unnecessary. We already have LPM as a precedent. Maybe we can do the same here? No need to add new sys_bpf commands. If the existing bpf_map_lookup_elem() helper doesn't fit qp-tree we can use new kfuncs from bpf prog and LPM-like map accessors from syscall.