Re: Question: Is it OK to assume the address of bpf_dynptr_kern will be 8-bytes aligned and reuse the lowest bits to save extra info ?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Andrii,

On 8/26/2023 2:33 AM, Andrii Nakryiko wrote:
> On Tue, Aug 22, 2023 at 6:12 AM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
SNIP
>>>> Yes. bpf prog will use dynptr as the map key. The bpf program will use
>>>> the same map helpers as hash map to operate on qp-trie and the verifier
>>>> will be updated to allow using dynptr as map key for qp-trie.
>>> And that's the problem I just mentioned.
>>> PTR_TO_MAP_KEY is special. I don't think we should hack it to also
>>> mean ARG_PTR_TO_DYNPTR depending on the first argument (map type).
>> Sorry for misunderstanding your reply. But before switch to the kfunc
>> way, could you please point me to some code or function which shows the
>> specialty of PTR_MAP_KEY ?
>>
>>
> Search in kernel/bpf/verifier.c how PTR_TO_MAP_KEY is handled. The
> logic assumes that there is associated struct bpf_map * pointer from
> which we know fixed-sized key length.

Thanks for the information. Will check that.
>
> But getting back to the topic at hand. I vaguely remember discussion
> we had, but it would be good if you could summarize it again here to
> avoid talking past each other. What is the bpf_map_ops changes you
> were thinking to do? How bpf_attr will look like? How BPF-side API for
> lookup/delete/update will look like? And then let's go from there?
> Thanks!

Sorry for the late reply. I am a bit distracted by other work this week.

For bpf_attr, a new field 'dynkey_size' is added to support
BPF_MAP_{LOOKUP/UPDATE/DELETE}_ELEM and BPF_MAP_GET_NEXT_KEY on qp-trie
as shown below:

struct { /* anonymous struct used by BPF_MAP_*_ELEM commands */
        __u32           map_fd;
        __aligned_u64   key;
        union {
                __aligned_u64 value;
                __aligned_u64 next_key;
        };
        __u64           flags;
        __u32           dynkey_size;    /* input/output for
                                         * BPF_MAP_GET_NEXT_KEY. input
                                         * only for other commands.
                                         */
};

And 4 new APIs are added in libbpf to support basic operations on qp-trie:

LIBBPF_API int bpf_map_update_dynkey_elem(int fd, const void *key,
unsigned int key_size, const void *value, __u64 flags);
LIBBPF_API int bpf_map_lookup_dynkey_elem(int fd, const void *key,
unsigned int key_size, void *value);
LIBBPF_API int bpf_map_delete_dynkey_elem(int fd, const void *key,
unsigned int key_size);
LIBBPF_API int bpf_map_get_next_dynkey(int fd, const void *key, void
*next_key, unsigned int *key_size);

About 3 weeks again, I have used the lowest bit of key pointer in
.map_lookup_elem/.map_update_elem/.map_delete_elem to distinguish
between bpf_user_dynkey-typed key from syscall and bpf_dynptr_kern-typed
key from bpf program. The definition of bpf_user_dynkey and its
allocation method are shown below. bpf syscall uses it to allocate
variable-sized key and passes it to qp-trie.

/* Allocate bpf_user_dynkey and its data together */
struct bpf_user_dynkey {
        unsigned int size;
        void *data;
};

static void *bpf_new_user_dynkey(unsigned int size)
{
        struct bpf_user_dynkey *dynkey;
        size_t total;

        total = round_up(sizeof(*dynkey) + size, 2);
        dynkey = kvmalloc(total, GFP_USER | __GFP_NOWARN);
        if (!dynkey)
                return ERR_PTR(-ENOMEM);

        dynkey->size = size;
        dynkey->data = &dynkey[1];
        return (void *)((long)dynkey | BPF_USER_DYNKEY_MARK);
}


After Alexei suggested that bit hack is only OK for memory or
performance reason, I'm planning to add 2 new callbacks in bpf_map_ops
to support update/delete operations in bpf syscall as shown below, but I
have tried it yet.

/* map is generic key/value storage optionally accessible by eBPF
programs */
struct bpf_map_ops {
        /* funcs callable from userspace (via syscall) */
        /* ...... */
        void *(*map_lookup_elem_sys_only)(struct bpf_map *map, void *key);
        long (*map_update_elem_sys_only)(struct bpf_map *map, void *key,
void *value, u64 flags);
        long (*map_delete_elem_sys_only)(struct bpf_map *map, void *key);
        /* ...... */
};










[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux