On Fri, Mar 08, 2019 at 08:51:03PM +0000, Saeed Mahameed wrote: > > Thoughts ? It's certainly an interesting idea. I think we need to agree on use cases and goals first before bikesheding on the solution. > > Example use cases (XDP only for now): > > 1) Query XDP stats of all XDP netdevs: > > xdp_ctrl = bpf_create_map(BPF_MAP_TYPE_CONTROL, map_attr.ctrl_type = > XDP_STATS); > > while (bpf_map_get_next_key(xdp_ctrl, &ifindex, &next_ifindex) == 0) { > bpf_map_lookup_elem(xdp_ctrl, &next_ifindex, &stats); > // we don't even need to know stats format in this case > btf_pretty_print(xdp_ctrl->btf, &stats); > ifindex = next_ifindex; > } this bit show cases advantage of BTF nicely. > 2) Setup XDP tx redirect resources on egress netdev (netdev with no XDP > program). > > xdp_ctrl = bpf_create_map(BPF_MAP_TYPE_CONTROL, map_attr.ctrl_type = > XDP_ATTR); > > xdp_attr->command = SETUP_REDIRECT; > xdp_attr->rings.num = 12; > xdp_attr->rings.size = 128; > bpf_map_update_elem(xdp_ctrl, &ifindex, &xdp_attr); this one starting to become a bit odd, since input arguments are split between key an value. ifindex is part of the key whereas command, rings.num and rings.size are part of the value. > 3) Turn On/Off XDP meta data offloads and retrieve meta data BTF format > of specific netdev/hardware: > > xdp_ctrl = bpf_create_map(BPF_MAP_TYPE_CONTROL, map_attr.ctrl_type = > XDP_ATTR); > > xdp_attr->command = SETUP_MD; > xdp_attr->enable_md = 1; > err = bpf_map_update_elem(xdp_ctrl, &ifindex, &xdp_attr); > if (err) { > printf("XDP meta data is not supported on this netdev"); > return; > } > // Query Meta data BTF > bpf_map_lookup_elem(xdp_ctrl, &ifindex, &xdp_attr); > md_btf = xdp_attr.md_btf; here it gets even weirder, since lookup arguments are also split between key and value. ifindex is inside the key while addition info is passed inside xdp_attr which is part of value. I wish we could do maps where every key would have different layout. Then such api would be suitable. The existing maps require all keys and values to be inform. I guess one can argue that such 'control map' can have one element and one value, but then what is the purpose of 'map_create' and 'map_delete==close(fd)' operations? I think we need something else here. Using BTF to describe output stats is nice, but using BTF to describe input query is problematic, since user cannot know before hand what kernel can and cannot accept. imo input should be stable uapi with fixed constants whereas stats-like output can be BTF based and vary from driver to driver and from one nic version to another.