On Thu, Jan 26, 2023 at 11:11 AM David Vernet <void@xxxxxxxxxxxxx> wrote:
>
> Hi everyone,
>
> Another proposal from me for LSF/MM/BPF, and the last one for the time
> being. I'd like to discuss enabling local-storage maps (e.g.
> BPF_MAP_TYPE_TASK_STORAGE and BPF_MAP_TYPE_CGRP_STORAGE) to be r/o
> mapped directly into user space. This would allow for quick lookups of
> per-object state from user space, similar to how we allow it for
> BPF_MAP_TYPE_ARRAY, without having to do something like either of the
> following:
>
> - Allocating a statically sized BPF_MAP_TYPE_ARRAY which is >= the # of
>   possible local-storage elements, which is likely wasteful in terms of
>   memory, and which isn't easy to iterate over.
>
> - Use something like https://docs.kernel.org/bpf/bpf_iterators.html to
>   iterate over tasks or cgroups, and collect information for each which
>   is then dumped to user space. This would probably work, but it's not
>   terribly performant in that it requires copying memory, trapping into
>   the kernel, and full iteration even when it's only necessary to look
>   up e.g. a single element.
>
> Designing and implementing this would be pretty non-trivial. We'd have
> to probably do a few things:
>
> 1. Write an allocator that dynamically allocates statically sized
>    local-storage entries for local-storage maps, and populates them into
>    pages which are mapped into user space.
>
> 2. Come up with some idr-like mechanism for mapping a local-storage
>    object to an index into the mapping. For example, mapping a task with
>    global pid 12345 to BPF_MAP_TYPE_TASK_STORAGE index 5, and providing
>    ergonomic and safe ways to update these entries in the kernel and
>    communicate them to user space.
>
> 3. Related to point 1 above, come up with some way to dynamically extend
>    the user space mapping as more local-storage elements are added. We
>    could potentially reserve a statically sized VA range and map all
>    unused VA pages to the zero page, or instead possibly just leave them
>    unmapped until they're actually needed.
>
> There are a lot of open questions, but I think it could be very useful
> if we can make it work. Let me know what you all think.
>

Hi David,

I remember, I had a similar idea and played with it last year. I don't
recall why I needed that feature back then, probably looking for ways
to pass per-task information from userspace and read it from within
BPF. I sent an RFC to the mailing list [1]. You could take a look, see
whether it is of help to you.

[1] https://www.spinics.net/lists/bpf/msg57565.html
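
To make point 2 of the quoted proposal a bit more concrete, here is a
rough user-space toy model of an id-to-index mapping over a fixed array
of statically sized entries. It is only a sketch, not kernel code and
not what the RFC in [1] does: every name in it (slot_map, slot_value,
MAX_SLOTS, and so on) is hypothetical, the "mapped region" is just an
ordinary array, and the linear scan stands in for whatever idr-like
structure would actually be used.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_SLOTS 8     /* hypothetical capacity of the mapped region */
#define NO_SLOT   (-1)  /* returned when an id has no slot */

/* Statically sized per-object value; stands in for one local-storage entry. */
struct slot_value {
	uint64_t counter;
};

/*
 * Toy model of the region. In the proposal, values[] would live in pages
 * mapped read-only into user space; here it is a plain array.
 */
struct slot_map {
	struct slot_value values[MAX_SLOTS];
	int32_t owner[MAX_SLOTS];  /* id that owns each slot, or -1 if free */
};

static void slot_map_init(struct slot_map *m)
{
	memset(m->values, 0, sizeof(m->values));
	for (int i = 0; i < MAX_SLOTS; i++)
		m->owner[i] = -1;
}

/* Return the index currently assigned to id, or NO_SLOT if there is none. */
static int slot_map_lookup(const struct slot_map *m, int32_t id)
{
	assert(id >= 0);
	for (int i = 0; i < MAX_SLOTS; i++)
		if (m->owner[i] == id)
			return i;
	return NO_SLOT;
}

/*
 * Assign the lowest free index to id and zero its value. Calling this
 * again for the same id returns the existing index. Returns NO_SLOT
 * when the region is full.
 */
static int slot_map_assign(struct slot_map *m, int32_t id)
{
	int idx = slot_map_lookup(m, id);

	if (idx != NO_SLOT)
		return idx;
	for (int i = 0; i < MAX_SLOTS; i++) {
		if (m->owner[i] == -1) {
			m->owner[i] = id;
			m->values[i].counter = 0;
			return i;
		}
	}
	return NO_SLOT;
}

/* Free the slot owned by id, if any, so its index can be reused. */
static void slot_map_release(struct slot_map *m, int32_t id)
{
	int idx = slot_map_lookup(m, id);

	if (idx != NO_SLOT)
		m->owner[idx] = -1;
}
```

In this model, assigning pid 12345 and then pid 777 to a fresh map
yields indices 0 and 1, and releasing 12345 makes index 0 available to
the next id. Index reuse like this is exactly where the "safe ways to
communicate them to user space" part gets hard, since a reader holding
a stale index would silently see another object's entry.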