Re: Question: Is it OK to assume the address of bpf_dynptr_kern will be 8-bytes aligned and reuse the lowest bits to save extra info ?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

On 8/22/2023 7:49 AM, Alexei Starovoitov wrote:
> On Sat, Aug 19, 2023 at 3:39 AM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
>> Hi,
>>
>> On 8/18/2023 7:00 AM, Alexei Starovoitov wrote:
>>> On Wed, Aug 16, 2023 at 11:35 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote:
>>>> ping ?
>>> Sorry for the delay. I've been on PTO.
>>>
>>>> On 8/3/2023 9:28 PM, Hou Tao wrote:
>>>>> On 8/3/2023 9:19 PM, Hou Tao wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I am preparing for qp-trie v4, but I need some help on how to support
>>>>>> variable-sized key in bpf syscall. The implementation of qp-trie needs
>>>>>> to distinguish between dynptr key from bpf program and variable-sized
>>>>>> key from bpf syscall. In v3, I added a new dynptr type:
>>>>>> BPF_DYNPTR_TYPE_USER for variable-sized key from bpf syscall [0], so
>>>>>> both bpf program and bpf syscall will use the same type to represent the
>>>>>> variable-sized key, but Andrii thought ptr+size tuple was simpler and
>>>>>> would be enough for user APIs, so in v4, the type of key for bpf program
>>>>>> and syscall will be different. One way to handle that is to add a new
>>>>>> parameter in .map_lookup_elem()/.map_delete_elem()/.map_update_elem() to
>>>>>> tell whether the key comes from bpf program or syscall or introduce new
>>>>>> APIs in bpf_map_ops for variable-sized key related syscall, but I think
>>>>>> it will introduce too much churn. Considering that the size of
>>>>>> bpf_dynptr_kern is 8-bytes aligned, so I think maybe I could reuse the
>>>>>> lowest 1-bit of key pointer to tell qp-trie whether or not it is a
>>>>>> bpf_dynptr_kern or a variable-sized key pointer from syscall. For
>>>>>> bpf_dynptr_kern, because it is 8B-aligned, so its lowest bit must be 0,
>>>>>> and for variable-sized key from syscall, I could allocated a 4B-aligned
>>>>>> pointer and setting the lowest bit as 1, so qp-trie can distinguish
>>>>>> between these two types of pointer. The question is that I am not sure
>>>>>> whether the idea above is a good one or not. Does it sound fragile ? Or
>>>>>> is there any better way to handle that ?
>>> Let's avoid bit hacks. They're not extensible and should be used
>>> only in cases where performance matters a lot or memory constraints are extreme.
>> I see. Neither the performance reason nor the memory limitation fit here.
>>> ptr/sz tuple from syscall side sounds the simplest.
>>> I agree with Andrii exposing the dynptr concept to user space
>>> and especially as part of syscall is unnecessary.
>>> We already have LPM as a precedent. Maybe we can do the same here?
>>> No need to add new sys_bpf commands.
>> There is no need to add new sys_bpf commands. We can extend bpf_attr to
>> support variable-sized key in qp-trie for bpf syscall. The probem is the
>> keys from bpf syscall and bpf program are different: bpf syscall uses
>> ptr+size tuple and bpf program uses dynptr, but the APIs in bpf_map_ops
>> only uses a pointer to represent the key, so qp-trie can not distinguish
>> between the keys from bpf syscall and bpf program. In qp-trie v1, the
>> key of qp-trie was similar with LPM-trie: both the syscall and program
>> used the same key format. But the key format for bpf program changed to
>> dynptr in qp-trie v2 according to the suggestion from Andrii. I think it
>> is also a bad ideal to go back to v1 again.
>>
>>> If the existing bpf_map_lookup_elem() helper doesn't fit qp-tree we can
>>> use new kfuncs from bpf prog and LPM-like map accessors from syscall.
>> It is a feasible solution, but it will make qp-trie be different with
>> other map types. I will try to extend the APIs in bpf_map_ops first to
>> see how much churns it may introduce.
> you mean you want to try to dynamically adapt bpf_map_lookup_elem()
> to consider 'void *key' as a pointer to dynptr for bpf prog and
> lpm-like tuple for syscall?
> I'm afraid the verifier changes will be messy, since PTR_TO_MAP_KEY is
> quite special.

No. I didn't plan to do that. I am trying to add three new APIs in
bpf_map_ops to handle lookup/update/delete operation from bpf syscall
(e.g, map_lookup_elem_syscall). So bpf program and bpf syscall can use
different API to operate on qp-trie.
> __bpf_kfunc void *bpf_qptree_lookup(const bpf_map *map, const struct
> bpf_dynptr_kern *key, ...);
> will be so much easier to add.





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux