Re: [PATCH bpf-next 0/3] Introduce local_storage exclusive caching

On Fri, Apr 22, 2022 at 12:05:23AM -0400, Dave Marchevsky wrote:
> On 4/21/22 9:40 PM, Alexei Starovoitov wrote:   
> > On Tue, Apr 19, 2022 at 05:21:40PM -0700, Dave Marchevsky wrote:
> >> Currently, each local_storage type (sk, inode, task) has a 16-entry
> >> cache for local_storage data associated with a particular map. A
> >> local_storage map is assigned a fixed cache_idx when it is allocated.
> >> When looking in a local_storage for data associated with a map, the
> >> cache entry at cache_idx is the only place the map's data can appear
> >> in cache. If the map's data is not in cache it is placed there after a
> >> search through the local_storage's hlist. When there are >16
> >> local_storage maps allocated for a local_storage type, multiple maps
> >> have the same cache_idx and thus may knock each other out of cache.
> >>
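For readers following along, the lookup path described above is roughly
the following (a simplified sketch of bpf_local_storage_lookup() in
kernel/bpf/bpf_local_storage.c; RCU details, locking and the re-caching
step on a miss are elided):

	sdata = rcu_dereference(local_storage->cache[smap->cache_idx]);
	if (sdata && sdata->smap == smap)
		/* fast path: hit in the map's fixed cache slot */
		return sdata;

	/* slow path: another map sharing cache_idx evicted us; walk the
	 * full list of all local_storage elems for this sk/inode/task
	 */
	hlist_for_each_entry_rcu(selem, &local_storage->list, snode)
		if (SDATA(selem)->smap == smap)
			return SDATA(selem);
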
> >> BPF programs that use local_storage may require fast and consistent
> >> local storage access. For example, a BPF program using task local
> >> storage to make scheduling decisions would not be able to tolerate a
> >> long hlist search for its local_storage as this would negatively affect
> >> cycles available to applications. Providing a mechanism for such a
> >> program to ensure that its local_storage data will always be in cache
> >> would ensure fast access.
> >>
> >> This series introduces a BPF_LOCAL_STORAGE_FORCE_CACHE flag that can be
> >> set on sk, inode, and task local_storage maps via map_extras. When a map
> >> with the FORCE_CACHE flag set is allocated it is assigned an 'exclusive'
> >> cache slot that it can't be evicted from until the map is freed.
> >>
> >> If there are no slots available to exclusively claim, the allocation
> >> fails. BPF programs are expected to use BPF_LOCAL_STORAGE_FORCE_CACHE
> >> only if their data _must_ be in cache.
> > 
> > I'm afraid a new uapi flag doesn't solve this problem.
> > Sooner or later every bpf program would deem itself "important" and
> > performance critical. All of them will be using the FORCE_CACHE flag
> > and we will be back to the same situation.
> 
In this scenario, if 16 maps had been loaded w/ the FORCE_CACHE flag and a 17th
tried to load, it would fail, so programs depending on that map would fail to load.

Ahh. I missed that the cache_idx is assigned at map creation time.
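For reference, slot assignment at map creation today looks roughly like
this (paraphrased from bpf_local_storage_cache_idx_get(); as I understand
the proposal, the new flag would additionally let a map claim its slot
exclusively, or fail the allocation if no slot is free):

	u16 cache_idx_get(struct bpf_local_storage_cache *cache)
	{
		u64 min_usage = U64_MAX;
		u16 i, res = 0;

		spin_lock(&cache->idx_lock);
		for (i = 0; i < BPF_LOCAL_STORAGE_CACHE_SIZE; i++) {
			/* pick the least-used of the 16 slots, so slots
			 * get shared once more than 16 maps exist
			 */
			if (cache->idx_usage_counts[i] < min_usage) {
				min_usage = cache->idx_usage_counts[i];
				res = i;
				if (!min_usage) /* free slot, take it */
					break;
			}
		}
		cache->idx_usage_counts[res]++;
		spin_unlock(&cache->idx_lock);

		return res;
	}
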

> Patch 2 adds a selftest 'local_storage_excl_cache_fail' demonstrating this.
> 
> > Also please share the performance data that shows more than 16 programs
> > using local storage at the same time, and that the existing simple cache
> > replacement logic is not enough.
> > For any kind of link list walking to become an issue there have to be at
> > least 17 progs. Two progs would have to pick up the same cache_idx and
> > run interleaved to evict each other.
> > It feels like an unlikely scenario, so real data would be good to see.
> > If it's really an issue we might need different caching logic.
> > Like instead of a single link list per local storage we might
> > have 16 link lists. cache_idx can point to a slot.
> > If a map's data is not 1st in its slot it will be 2nd in a much shorter
> > link list. With 16 slots the link lists will have at most 2 elements
> > until more than 32 bpf progs are using local storage.
> > We can get rid of the cache too and replace it with a mini hash table of
> > N elements where map_id would be the index into the hash table.
> > All sorts of other algorithms are possible.
> > In any case the bpf user shouldn't be telling the kernel about the
> > "importance" of its program. If a program is indeed executing a lot,
> > the kernel should be caching/accelerating it where it can.
> 
It's worth noting that this is a map-level setting, not prog-level. Telling the
kernel about the importance of data feels more palatable to me. Sort of like
mmap's MAP_LOCKED, but for the local_storage cache.

For mmap it's an operational difference, not just a performance one.
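MAP_LOCKED changes what the kernel guarantees, not just how fast things go:

	#include <sys/mman.h>

	/* The pages are locked like mlock(): they cannot be paged out,
	 * and the call can fail with EAGAIN against RLIMIT_MEMLOCK.
	 * That's a semantic guarantee, not a caching hint.
	 */
	void *p = mmap(NULL, 1 << 20, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | MAP_LOCKED, -1, 0);
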

> Going back to the motivating example - using data in task local_storage to make
> scheduling decisions - the desire is to have the task local_storage access be
> like "accessing a task_struct member" vs "doing a search for the right data to
> access (w/ some caching to try to avoid the search)".

Exactly. The motivation is performance. Let's try to get good performance
without uapi flags.
Think from the user's pov. They have to pick between FORCE_CACHE == good
performance and no flag == bad(?) performance.
Why would anyone pick something that has worse performance?
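To make the per-slot alternative from earlier in the thread concrete, here
is an illustrative standalone model (names and layout are hypothetical,
not from the kernel or from this patch series):

	#include <stddef.h>

	#define CACHE_SIZE 16

	struct map  { unsigned int id; unsigned int cache_idx; };
	struct elem { struct elem *next; struct map *smap; void *data; };

	struct local_storage {
		/* 16 short lists instead of one long list + 16-entry cache */
		struct elem *slots[CACHE_SIZE];
	};

	static void *lookup(struct local_storage *ls, struct map *smap)
	{
		/* walk only the short list for this map's slot: with N maps
		 * in use each list holds about N/16 entries, so the walk
		 * stays short well past 16 maps -- and no uapi flag needed
		 */
		for (struct elem *e = ls->slots[smap->cache_idx]; e; e = e->next)
			if (e->smap == smap)
				return e->data;
		return NULL;
	}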


