On 17/02/2021 10:36, Michal Hocko wrote:
> On Wed 17-02-21 09:08:07, Johannes Thumshirn wrote:
>> On 17/02/2021 09:03, Michal Hocko wrote:
>>>> No I don't think so. A mutex isn't a spinlock so we can sleep on the allocation.
>>>> We can't use GFP_KERNEL as we're about to do I/O. blk_revalidate_disk_zones() called
>>>> a few lines below also does the memalloc_noio_{save,restore}() dance.
>>>
>>> You should be extending noio scope then if this allocation falls into
>>> the same category. Ideally the scope should start at the recursion place
>>> and end where the scope really ended.
>>
>> That means all callers of blk_revalidate_disk_zones() should do
>> memalloc_noio_{save,restore}?
>
> I am not really familiar with the IO area to answer this. The base idea
> is to start the NOIO scope at the boundary which defines "unsafe to
> re-enter or cannot deal with a new IO" from the reclaim path.
>
>> If yes, can we somehow runtime assert that this is done, so we don't
>> end up with bad surprises?
>
> Could you elaborate?

I thought of lifting the noio scope into the callers of
blk_revalidate_disk_zones() and then "checking" in
blk_revalidate_disk_zones() that this has been done. But it looks like
memalloc_noio_save() can handle nesting, so this is actually unneeded.

>
>>>> Would a kmem_cache for these revalidations help us in any way?
>>>
>>> I am not sure what you mean here.
>>>
>>
>> Using a kmem_cache for the allocations passed into blk_revalidate_disk_zones().
>> I've looked into kmem_cache_alloc() and I couldn't find anything that speaks
>> against it, but I'm not too familiar with the code.
>
> kmem_cache_alloc is only an extension to allow to allocate from a
> specific cache. I do not really see how it is going to help with larger
> allocation and my current understanding is that kvmalloc is used because
> the requested allocation size can be large.
>

Ah ok, so we can't set aside a big enough pool to do allocations from
there. This was a misunderstanding on my side.
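
To illustrate why the nesting makes a runtime check unnecessary, here is a
minimal userspace model of the save/restore pattern. It is only a sketch:
the flag values, task_flags (standing in for current->flags), the noio_*
and effective_gfp() helpers, and inner_revalidate() (standing in for
blk_revalidate_disk_zones()) are all hypothetical names made up for this
example, not the kernel's definitions. It assumes that save returns the
previous state of the NOIO bit and that restore puts exactly that state
back, which is how I read the helpers.

```c
/* Userspace model only: names and bit values are illustrative. */
#include <assert.h>

#define PF_MEMALLOC_NOIO  0x1u  /* illustrative bit, not the kernel's value */
#define GFP_IO            0x1u  /* illustrative: allocation may start I/O */
#define GFP_FS            0x2u  /* illustrative: allocation may enter the FS */
#define GFP_KERNEL_MODEL  (GFP_IO | GFP_FS)

static unsigned int task_flags; /* stands in for current->flags */

/* Enter a NOIO scope and return the previous state of the NOIO bit. */
static unsigned int noio_save(void)
{
	unsigned int old = task_flags & PF_MEMALLOC_NOIO;

	task_flags |= PF_MEMALLOC_NOIO;
	return old;
}

/* Leave a NOIO scope by restoring the state that noio_save() returned. */
static void noio_restore(unsigned int old)
{
	task_flags = (task_flags & ~PF_MEMALLOC_NOIO) | old;
}

/* Inside a NOIO scope, the I/O and FS bits are masked from the request. */
static unsigned int effective_gfp(unsigned int gfp)
{
	if (task_flags & PF_MEMALLOC_NOIO)
		gfp &= ~(GFP_IO | GFP_FS);
	return gfp;
}

/* Models a callee that opens its own scope, whether or not one is active. */
static unsigned int inner_revalidate(void)
{
	unsigned int old = noio_save();
	unsigned int gfp = effective_gfp(GFP_KERNEL_MODEL);

	noio_restore(old);
	return gfp;
}

/* Models a caller whose scope also covers its own allocation. */
static int caller_with_outer_scope(void)
{
	unsigned int old = noio_save();
	unsigned int inner_gfp = inner_revalidate();
	/* The inner restore must not have ended the outer scope. */
	int still_noio = (task_flags & PF_MEMALLOC_NOIO) != 0;
	unsigned int after_gfp = effective_gfp(GFP_KERNEL_MODEL);

	noio_restore(old);
	return inner_gfp == 0 && still_noio && after_gfp == 0;
}
```

In this model caller_with_outer_scope() returns 1 and leaves task_flags
at 0 afterwards: the inner restore only puts back what the inner save
saw, so the outer scope stays active until the caller ends it.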