Hi Andrii, On 8/26/2023 2:33 AM, Andrii Nakryiko wrote: > On Tue, Aug 22, 2023 at 6:12 AM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote: SNIP >>>> Yes. bpf prog will use dynptr as the map key. The bpf program will use >>>> the same map helpers as hash map to operate on qp-trie and the verifier >>>> will be updated to allow using dynptr as map key for qp-trie. >>> And that's the problem I just mentioned. >>> PTR_TO_MAP_KEY is special. I don't think we should hack it to also >>> mean ARG_PTR_TO_DYNPTR depending on the first argument (map type). >> Sorry for misunderstanding your reply. But before switch to the kfunc >> way, could you please point me to some code or function which shows the >> specialty of PTR_MAP_KEY ? >> >> > Search in kernel/bpf/verifier.c how PTR_TO_MAP_KEY is handled. The > logic assumes that there is associated struct bpf_map * pointer from > which we know fixed-sized key length. Thanks for the information. Will check that. > > But getting back to the topic at hand. I vaguely remember discussion > we had, but it would be good if you could summarize it again here to > avoid talking past each other. What is the bpf_map_ops changes you > were thinking to do? How bpf_attr will look like? How BPF-side API for > lookup/delete/update will look like? And then let's go from there? > Thanks! Sorry for the late reply. I am a bit distracted by other work this week. For bpf_attr, a new field 'dynkey_size' is added to support BPF_MAP_{LOOKUP/UPDATE/DELETE}_ELEM and BPF_MAP_GET_NEXT_KEY on qp-trie as shown below: struct { /* anonymous struct used by BPF_MAP_*_ELEM commands */ __u32 map_fd; __aligned_u64 key; union { __aligned_u64 value; __aligned_u64 next_key; }; __u64 flags; __u32 dynkey_size; /* input/output for * BPF_MAP_GET_NEXT_KEY. input * only for other commands. */ }; And 4 new APIs are added in libbpf to support basic operations on qp-trie: LIBBPF_API int bpf_map_update_dynkey_elem(int fd, const void *key, unsigned int key_size, const void *value, __u64 flags); LIBBPF_API int bpf_map_lookup_dynkey_elem(int fd, const void *key, unsigned int key_size, void *value); LIBBPF_API int bpf_map_delete_dynkey_elem(int fd, const void *key, unsigned int key_size); LIBBPF_API int bpf_map_get_next_dynkey(int fd, const void *key, void *next_key, unsigned int *key_size); About 3 weeks again, I have used the lowest bit of key pointer in .map_lookup_elem/.map_update_elem/.map_delete_elem to distinguish between bpf_user_dynkey-typed key from syscall and bpf_dynptr_kern-typed key from bpf program. The definition of bpf_user_dynkey and its allocation method are shown below. bpf syscall uses it to allocate variable-sized key and passes it to qp-trie. /* Allocate bpf_user_dynkey and its data together */ struct bpf_user_dynkey { unsigned int size; void *data; }; static void *bpf_new_user_dynkey(unsigned int size) { struct bpf_user_dynkey *dynkey; size_t total; total = round_up(sizeof(*dynkey) + size, 2); dynkey = kvmalloc(total, GFP_USER | __GFP_NOWARN); if (!dynkey) return ERR_PTR(-ENOMEM); dynkey->size = size; dynkey->data = &dynkey[1]; return (void *)((long)dynkey | BPF_USER_DYNKEY_MARK); } After Alexei suggested that bit hack is only OK for memory or performance reason, I'm planning to add 2 new callbacks in bpf_map_ops to support update/delete operations in bpf syscall as shown below, but I have tried it yet. /* map is generic key/value storage optionally accessible by eBPF programs */ struct bpf_map_ops { /* funcs callable from userspace (via syscall) */ /* ...... */ void *(*map_lookup_elem_sys_only)(struct bpf_map *map, void *key); long (*map_update_elem_sys_only)(struct bpf_map *map, void *key, void *value, u64 flags); long (*map_delete_elem_sys_only)(struct bpf_map *map, void *key); /* ...... */ };