On 8/30/19 1:15 PM, Stanislav Fomichev wrote:
> On 08/29, Jakub Kicinski wrote:
>> On Thu, 29 Aug 2019 16:13:59 -0700, Brian Vazquez wrote:
>>>> We need a per-map implementation of the exec side, but roughly maps
>>>> would do:
>>>>
>>>> 	LIST_HEAD(deleted);
>>>>
>>>> 	for entry in map {
>>>> 		struct map_op_ctx ctx = {
>>>> 			.key = entry->key,
>>>> 			.value = entry->value,
>>>> 		};
>>>>
>>>> 		act = BPF_PROG_RUN(filter, &ctx);
>>>> 		if (act & ~ACT_BITS)
>>>> 			return -EINVAL;
>>>>
>>>> 		if (act & DELETE) {
>>>> 			map_unlink(entry);
>>>> 			list_add(entry, &deleted);
>>>> 		}
>>>> 		if (act & STOP)
>>>> 			break;
>>>> 	}
>>>>
>>>> 	synchronize_rcu();
>>>>
>>>> 	for entry in deleted {
>>>> 		struct map_op_ctx ctx = {
>>>> 			.key = entry->key,
>>>> 			.value = entry->value,
>>>> 		};
>>>>
>>>> 		BPF_PROG_RUN(dumper, &ctx);
>>>> 		map_free(entry);
>>>> 	}
>>>>
>>> Hi Jakub,
>>>
>>> how would that approach support percpu maps?
>>>
>>> I'm thinking of a scenario where you want to do some calculations on
>>> percpu maps and you are interested in the info on all the CPUs, not
>>> just the one that is running the bpf program. Currently on a percpu
>>> map the bpf_map_lookup_elem helper only returns the pointer to the
>>> data of the executing cpu.
>>
>> Right, we need to have the iteration outside of the bpf program itself,
>> and pass the element in through the context. That way we can feed each
>> per-cpu entry into the program separately.
> My 2 cents:
>
> I personally like Jakub's/Quentin's proposal more. So if I get to choose
> between this series and Jakub's filter+dump in BPF, I'd pick filter+dump
> (pending the per-cpu issue, which we actually care about).
>
> But if we can have both, I don't have any objections; this patch
> series looks to me a lot like what Brian did, just extended to more
> commands. If we are fine with the shortcomings raised about the
> original series, then let's go with this version. Maybe we can also
> look into addressing those independently.
>
> But if I pretend that we live in an ideal world, I'd just go with
> whatever Jakub and Quentin are doing so we don't have to support
> two APIs that essentially do the same thing (minus batched update,
> but it looks like there is no clear use case for that yet; maybe).
>
> I guess you can hold off this series a bit and discuss it at LPC;
> you have a talk dedicated to that :-) (and afaiu, you are all going)

Absolutely. We will have a discussion on map batching and I signed up
for that :-). One of the goals for this patch set is for me to explore
what uapi (attr and bpf subcommands) we should expose to users.
Hopefully by then we will have more clarity on Jakub's approach and we
can discuss how to proceed.
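
To give the LPC discussion something concrete to poke at: under Jakub's
proposal, a filter program might look roughly like the sketch below.
Everything here is an assumption rather than an agreed uapi. The
map_op_ctx layout (including a cpu field so percpu maps can be fed one
CPU copy at a time, following Jakub's comment above), the MAP_FILTER_*
action bits, the section name, and the flow_stats value type are all
made up for illustration.

	#include <linux/bpf.h>
	#include <bpf/bpf_helpers.h>	/* libbpf's SEC() and helper declarations */

	/* Hypothetical per-element context filled in by the kernel-side loop.
	 * For a percpu map the loop would run the program once per CPU copy,
	 * with @cpu saying which copy @value points at.
	 */
	struct map_op_ctx {
		void *key;
		void *value;
		__u32 cpu;
	};

	/* Hypothetical action bits, mirroring DELETE/STOP in the sketch above. */
	#define MAP_FILTER_DELETE	(1U << 0)
	#define MAP_FILTER_STOP		(1U << 1)

	/* Made-up value type for a hypothetical flow-tracking map. */
	struct flow_stats {
		__u64 last_seen_ns;
		__u64 packets;
	};

	#define IDLE_NS	(30ULL * 1000000000ULL)	/* 30 seconds */

	SEC("map_filter")
	int drop_idle_flows(struct map_op_ctx *ctx)
	{
		struct flow_stats *val = ctx->value;

		/* Delete flows that have been idle for more than 30s;
		 * keep everything else and let the iteration continue.
		 */
		if (bpf_ktime_get_ns() - val->last_seen_ns > IDLE_NS)
			return MAP_FILTER_DELETE;

		return 0;
	}

	char _license[] SEC("license") = "GPL";

For a percpu map, the kernel loop would then invoke the program once per
(key, cpu) pair, so the program never has to look up other CPUs' copies
itself, which is how I read Jakub's answer to Brian's question.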