On Fri, Feb 14, 2025 at 9:30 AM Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote: > > On Thu, Feb 13, 2025 at 11:25 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote: > > > > Hi, > > > > On 2/14/2025 2:49 PM, Hou Tao wrote: > > > Hi, > > > > > > On 2/14/2025 12:17 PM, Alexei Starovoitov wrote: > > >> On Thu, Feb 13, 2025 at 8:12 PM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote: > > >>> Hi, > > >>> > > >>> On 2/14/2025 7:56 AM, Alexei Starovoitov wrote: > > >>>> On Sat, Jan 25, 2025 at 2:59 AM Hou Tao <houtao@xxxxxxxxxxxxxxx> wrote: > > >>>>> From: Hou Tao <houtao1@xxxxxxxxxx> > > >>>>> > > >>>>> When there is bpf_dynptr field in the map key btf type or the map key > > >>>>> btf type is bpf_dyntr, set BPF_INT_F_DYNPTR_IN_KEY in map_flags. > > >>>>> > > >>>>> Signed-off-by: Hou Tao <houtao1@xxxxxxxxxx> > > >>>>> --- > > >>>>> kernel/bpf/syscall.c | 36 ++++++++++++++++++++++++++++++++++++ > > >>>>> 1 file changed, 36 insertions(+) > > >>>>> > > >>>>> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c > > >>>>> index 07c67ad1a6a07..46b96d062d2db 100644 > > >>>>> --- a/kernel/bpf/syscall.c > > >>>>> +++ b/kernel/bpf/syscall.c > > >>>>> @@ -1360,6 +1360,34 @@ static struct btf *get_map_btf(int btf_fd) > > >>>>> return btf; > > >>>>> } > > >>>>> > > > SNIP > > >>>>> #define BPF_MAP_CREATE_LAST_FIELD map_token_fd > > >>>>> /* called via syscall */ > > >>>>> static int map_create(union bpf_attr *attr) > > >>>>> @@ -1398,6 +1426,14 @@ static int map_create(union bpf_attr *attr) > > >>>>> btf = get_map_btf(attr->btf_fd); > > >>>>> if (IS_ERR(btf)) > > >>>>> return PTR_ERR(btf); > > >>>>> + > > >>>>> + err = map_has_dynptr_in_key_type(btf, attr->btf_key_type_id, attr->key_size); > > >>>>> + if (err < 0) > > >>>>> + goto put_btf; > > >>>>> + if (err > 0) { > > >>>>> + attr->map_flags |= BPF_INT_F_DYNPTR_IN_KEY; > > >>>> I don't like this inband signaling in the uapi field. > > >>>> The whole refactoring in patch 4 to do patch 6 and > > >>>> subsequent bpf_map_has_dynptr_key() in various places > > >>>> feels like reinventing the wheel. > > >>>> > > >>>> We already have map_check_btf() mechanism that works for > > >>>> existing special fields inside BTF. > > >>>> Please use it. > > >>> Yes. However map->key_record is only available after the map is created, > > >>> but the creation of hash map needs to check it before the map is > > >>> created. Instead of using an internal flag, how about adding extra > > >>> argument for both ->map_alloc_check() and ->map_alloc() as proposed in > > >>> the commit message of the previous patch ? > > >>>> map_has_dynptr_in_key_type() can be done in map_check_btf() > > >>>> after map is created, no ? > > >>> No. both ->map_alloc_check() and ->map_alloc() need to know whether > > >>> dynptr is enabled (as explained in the previous commit message). Both of > > >>> these functions are called before the map is created. > > >> Is that the explanation? > > >> " > > >> The reason for an internal map flag is twofolds: > > >> 1) user doesn't need to set the map flag explicitly > > >> map_create() will use the presence of bpf_dynptr in map key as an > > >> indicator of enabling dynptr key. > > >> 2) avoid adding new arguments for ->map_alloc_check() and ->map_alloc() > > >> map_create() needs to pass the supported status of dynptr key to > > >> ->map_alloc_check (e.g., check the maximum length of dynptr data size) > > >> and ->map_alloc (e.g., check whether dynptr key fits current map type). > > >> Adding new arguments for these callbacks to achieve that will introduce > > >> too much churns. > > >> > > >> Therefore, the patch uses the topmost bit of map_flags as the internal > > >> map flag. map_create() checks whether the internal flag is set in the > > >> beginning and bpf_map_get_info_by_fd() clears the internal flag before > > >> returns the map flags to userspace. > > >> " > > >> > > >> As commented in the other patch map_extra can be dropped (I hope). > > >> When it's gone, the map can be destroyed after creation in map_check_btf(). > > >> What am I missing? > > > If I understanding correctly, you are suggesting to replace > > > (map->map_flags & BPF_INT_F_DYNPTR_IN_KEY) with !!map->key_record, right > > > ? And you also don't want to move map_check_btf() before the invocation > > > of ->map_alloc_check() and ->map_alloc(), right ? However, beside the > > > checking of map_extra, ->map_alloc_check() also needs to know whether > > > the dynptr-typed key is suitable for current hash map type or map flags. > > > ->map_alloc() also needs to allocate a bpf mem allocator for the dynptr > > > key. So are you proposing the following steps for creating a dynkey hash > > > map: > > > > > > 1) ->map_alloc_check() > > > no change > > > > > > 2) ->map_alloc() > > > allocate bpf mem allocator for dynptr unconditionally > > > > > > 3) map_check_btf() > > > invokes an new map callback (e.g., ->map_alloc_post_check()) to check > > > whether the created map is mismatched with the dynptr key and destroy it > > > if it is. > > > > Sorry, I misread the code, so the third steps is: > > > > 3) ->map_check_btf() > > > > In ->map_check_btf() callback, check whether the created map is > > mismatched with the dynptr key. If it is, let map_create() destroys the map. > > map_check_btf() itself can have the code to filter out unsupported maps > like it does already: > case BPF_WORKQUEUE: > if (map->map_type != BPF_MAP_TYPE_HASH && > map->map_type != BPF_MAP_TYPE_LRU_HASH && > map->map_type != BPF_MAP_TYPE_ARRAY) { > ret = -EOPNOTSUPP; > > I don't mind moving map_check_btf() before ->map_alloc_check() > since it doesn't really need 'map' pointer. > I objected to partial move where btf_get_by_fd() is done early > while the rest after map allocation. > Either all map types do map_check_btf() before alloc or > all map types do it after. > > If we move map_check_btf() before alloc > then the final map->ops->map_check_btf() should probably > stay after alloc. > Otherwise this is too much churn. > > So I think it's better to try to keep the whole map_check_btf() after > as it is right now. > I don't see yet why dynptr-in-key has to have it before. > So far map_extra limitation was the only special condition, > but even if we have to keep (which I doubt) it can be done in > map->ops->map_check_btf(). Any update on this ? Two weeks have passed. iirc above was the only thing left to resolve.