On Wed, Oct 09, 2024 at 12:17:12AM -0700, Namhyung Kim wrote:
> On Mon, Oct 07, 2024 at 02:57:08PM +0200, Vlastimil Babka wrote:
> > On 10/4/24 11:25 PM, Roman Gushchin wrote:
> > > On Fri, Oct 04, 2024 at 01:10:58PM -0700, Song Liu wrote:
> > >> On Wed, Oct 2, 2024 at 11:10 AM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
> > >>>
> > >>> The bpf_get_kmem_cache() is to get a slab cache information from a
> > >>> virtual address like virt_to_cache(). If the address is a pointer
> > >>> to a slab object, it'd return a valid kmem_cache pointer, otherwise
> > >>> NULL is returned.
> > >>>
> > >>> It doesn't grab a reference count of the kmem_cache so the caller is
> > >>> responsible to manage the access. The intended use case for now is to
> > >>> symbolize locks in slab objects from the lock contention tracepoints.
> > >>>
> > >>> Suggested-by: Vlastimil Babka <vbabka@xxxxxxx>
> > >>> Acked-by: Roman Gushchin <roman.gushchin@xxxxxxxxx> (mm/*)
> > >>> Acked-by: Vlastimil Babka <vbabka@xxxxxxx> #mm/slab
> > >>> Signed-off-by: Namhyung Kim <namhyung@xxxxxxxxxx>
> > >
> >
> > So IIRC from our discussions with Namhyung and Arnaldo at LSF/MM I
> > thought the perf use case was:
> >
> > - at the beginning it iterates the kmem caches and stores anything of
> >   possible interest in bpf maps or somewhere - hence we have the iterator
> > - during profiling, from object it gets to a cache, but doesn't need to
> >   access the cache - just store the kmem_cache address in the perf record
> > - after profiling itself, use the information in the maps from the first
> >   step together with cache pointers from the second step to calculate
> >   whatever is necessary
>
> Correct.
> >
> > So at no point it should be necessary to take refcount to a kmem_cache?
> >
> > But maybe "bpf_get_kmem_cache()" is implemented here as too generic
> > given the above use case and it should be implemented in a way that the
> > pointer it returns cannot be used to access anything (which could be
> > unsafe), but only as a bpf map key - so it should return e.g. an
> > unsigned long instead?
>
> Yep, this should work for my use case. Maybe we don't need the
> iterator when bpf_get_kmem_cache() kfunc returns the valid pointer as
> we can get the necessary info at the moment. But I think it'd be less
> efficient as more work need to be done at the event (lock contention).
> It'd better setting up necessary info in a map before monitoring (using
> the iterator), and just looking up the map with the kfunc while
> monitoring the lock contention.

Maybe it's still better to return a non-refcounted pointer for future
use. I'll leave it for v5.

Thanks,
Namhyung
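
For reference, the flow discussed above (fill a map before monitoring,
then only look up by cache address at the event) could be sketched
roughly like this. It is a plain userspace mock-up, not BPF or kernel
code: cache_map, record_cache() and lookup_cache() are hypothetical
names standing in for a BPF hash map, the iterator step and the lookup
done with the value returned by the kfunc. The address is treated as
an opaque unsigned long key and is never dereferenced.

```c
#include <stdio.h>

/* Hypothetical stand-in for a BPF hash map keyed by kmem_cache address. */
#define MAX_CACHES 8

struct cache_info {
	unsigned long key;	/* kmem_cache address, used only as a key */
	char name[32];		/* cache name saved before monitoring */
};

static struct cache_info cache_map[MAX_CACHES];
static int nr_caches;

/* Before monitoring: what the iterator step would store, once per cache. */
static int record_cache(unsigned long addr, const char *name)
{
	struct cache_info *ci;

	if (nr_caches >= MAX_CACHES)
		return -1;

	ci = &cache_map[nr_caches++];
	ci->key = addr;
	snprintf(ci->name, sizeof(ci->name), "%s", name);
	return 0;
}

/* At the event: map lookup only, no access through the address itself. */
static const char *lookup_cache(unsigned long addr)
{
	for (int i = 0; i < nr_caches; i++) {
		if (cache_map[i].key == addr)
			return cache_map[i].name;
	}
	return NULL;	/* not a known slab cache */
}
```

With this split the per-event cost is a single lookup, and an address
that was never recorded simply yields NULL.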