On Sun, Apr 07, 2019 at 01:55:14AM +0200, Daniel Borkmann wrote: > On 04/06/2019 06:54 PM, Alexei Starovoitov wrote: > > On Sat, Apr 06, 2019 at 12:58:23PM +0200, Daniel Borkmann wrote: > >> On 04/06/2019 03:56 AM, Alexei Starovoitov wrote: > >>> On Fri, Apr 05, 2019 at 10:59:27PM +0200, Daniel Borkmann wrote: > >>>> > >>>> -/* when bpf_ldimm64->src_reg == BPF_PSEUDO_MAP_FD, bpf_ldimm64->imm == fd */ > >>>> +/* When BPF ldimm64's insn[0].src_reg != 0 then this can have > >>>> + * two extensions: > >>>> + * > >>>> + * insn[0].src_reg: BPF_PSEUDO_MAP_FD BPF_PSEUDO_MAP_VALUE > >>>> + * insn[0].imm: map fd map fd > >>>> + * insn[1].imm: 0 offset into value > >>>> + * insn[0].off: 0 lower 16 bit of map index > >>>> + * insn[1].off: 0 higher 16 bit of map index > >>>> + * ldimm64 rewrite: address of map address of map[index]+offset > >>>> + * verifier type: CONST_PTR_TO_MAP PTR_TO_MAP_VALUE > >>> ... > >>>> + else if (insn->src_reg == BPF_PSEUDO_MAP_VALUE) > >>>> + snprintf(dd->scratch_buff, sizeof(dd->scratch_buff), > >>>> + "map[id:%u][%u]+%u", insn->imm, > >>>> + ((__u32)(__u16)insn[0].off) | > >>>> + ((__u32)(__u16)insn[1].off) << 16, > >>>> + (insn + 1)->imm); > >>> > >>> Hopefully one last nit... > >>> Do we really need to allow this odd split index support? > >>> Later patches enforce array of 1 element and libbpf only uses that. > >>> This index feature feels too quirky and not really useful at this moment. > >>> Can we enforce that insn[0|1].off == 0 instead ? > >>> Later we can extend it to mean index without breaking anything. > >> > >> I originally didn't have it in v2 of the series, but I ended up > >> implementing it after feedback from Andrii back then complaining > >> that it's too specific and not generic enough. I agreed with him > >> that the limitation of max_elems = 1 wasn't too nice, so I went > >> to implement that full 32 bit index can be used thus that it has > >> the potential of efficient map lookup replacement for array maps in > >> general which is quite nice since within single insn it allows to > >> select index and offset into value all as simple 64 bit imm load. > > > > I missed this discussion. > > It sort-of sounds nice from kernel side, but how one can use it > > from bpf program written in C ? > > If it's assembler only feature, I'd rather not do it. > > statc int ar1[N]; > > and > > static struct S { ...} ar2[M]; > > will still be normal map of 1 element from llvm side. > > Right now there is no support for variable length access > > into static vars. When it's added in the future the ar[var] will be > > some base offset into map of 1 element plus register addition. > > So no opportunity to use 'index'. > > bpf_map_lookup_elem(map, &key); from program side passes > > a pointer and it's a function call. Even if key is constant > > register spill/fill due to function call caused perf loss, > > so extra 'index' optimization won't buy much. > > We can introduce some special intrinsic/builtin to support > > this 'index' from C, but it's not pretty. > > So far I couldn't come up with C example that can use such 'index' feature. > > Wrt 'index' was mostly thinking into llvm-builtin direction for how > it could be consumed from C aside from loaders. One example for complex > networking programs that comes to mind right away would be mib-style error > counters similarly we have in the stack where the counter acts as index for > the direct lookup. how would such builtin look? I still don't see it fiting into C code. > Another option [for loaders] could be (given the sections > are for the whole object) to allow an option for some of the programs in > a given object to have private .bss/.data/.rodata sections by allocating > max_elems > 1 and selecting the target buffer via index given nothing > changes in terms of size or vars. I still think it's a useful extension > to carry and keeps the direct value access a generic implementation. I don't get this shadow vs normal .data idea. The more we talk the more I'm convinced that this is not a good api. Say in the future we indeed have these shadow + normal .data then just use the same insn->imm field to refer to shadow part. Even if there are N such regions. The value_size is known. So use 0<=imm<value_size to refer to 'index' 0 and value_size<=imm<value_size*2 to refer to 'index' 1. There is absolutely no need for offset and index to be separate. Address of a byte inside bpf array can be expressed with single integer.