On 23.03.21 13:04, Michal Hocko wrote:
On Tue 23-03-21 12:48:58, Christian König wrote:
On 23.03.21 12:28, Daniel Vetter wrote:
On Tue, Mar 23, 2021 at 08:38:33AM +0100, Michal Hocko wrote:
On Mon 22-03-21 20:34:25, Christian König wrote:
[...]
My only concern is whether I can rely on memalloc_no* being used; if I
could, we could optimize this quite a bit further.
Yes you can use the scope API and you will be guaranteed that _any_
allocation from the enclosed context will inherit GFP_NO* semantic.
The question is whether this is also guaranteed the other way around.
In other words, if somebody calls get_free_page(GFP_NOFS), are the context
flags set as well?
The gfp mask is always restricted in the page allocator. So say you have a
noio scope context and call get_free_page/kmalloc(GFP_NOFS); the scope
would then restrict the allocation flags to GFP_NOIO (i.e. drop
__GFP_IO). For further details, have a look at current_gfp_context
and its callers.
Does this answer your question?
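As an illustration, here is a minimal sketch of how the scope API composes
with per-call gfp flags (memalloc_nofs_save/restore and current_gfp_context
are the real kernel interfaces; the helper wrapped around them is made up):

#include <linux/sched/mm.h>	/* memalloc_nofs_save/restore, current_gfp_context */
#include <linux/gfp.h>
#include <linux/slab.h>

/* Hypothetical helper: any allocation inside the scope inherits NOFS. */
static void *example_alloc_in_nofs_scope(size_t size)
{
	unsigned int noreclaim_flags = memalloc_nofs_save();
	void *obj;

	/*
	 * GFP_KERNEL is passed here, but the allocator consults
	 * current_gfp_context() and drops __GFP_FS because of the scope,
	 * so this effectively becomes a GFP_NOFS allocation.
	 */
	obj = kmalloc(size, GFP_KERNEL);

	memalloc_nofs_restore(noreclaim_flags);
	return obj;
}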
But what happens if you don't have a noio scope and somebody calls
get_free_page(GFP_NOFS)?
Is the noio scope then added automatically? And is it possible that the
shrinker gets called without the noio scope even though we would need it?
I think this is where I don't yet get what Christian is trying to do: we
really shouldn't use different tricks and calling contexts for direct
reclaim and kswapd reclaim, otherwise very hard to track down bugs are
pretty much guaranteed. So whether we use explicit gfp flags or the
context APIs, the result is exactly the same.
Ok, let us recap what TTM's TT shrinker does here:
1. We have memory which is not swappable because it might be accessed by the
GPU at any time.
2. Make sure the memory is not accessed by the GPU and that drivers need to
grab a lock before they can make it accessible again.
3. Allocate a shmem file and copy over the non-swappable pages.
This is quite tricky because the shrinker operates in the PF_MEMALLOC
context, so such an allocation would be allowed to completely deplete
memory unless you explicitly mark the allocation with __GFP_NOMEMALLOC.
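For illustration, a rough sketch of what the copy in step 3 could look like
with the allocation explicitly kept away from the memory reserves
(shmem_file_setup() and shmem_read_mapping_page_gfp() are the existing
kernel helpers; the function and the flag combination around them are just
an example, not the actual TTM code):

#include <linux/mm.h>
#include <linux/shmem_fs.h>	/* shmem_file_setup, shmem_read_mapping_page_gfp */
#include <linux/highmem.h>	/* copy_highpage */
#include <linux/swap.h>		/* mark_page_accessed */
#include <linux/file.h>		/* fput */

/* Illustrative only: copy one non-swappable page into a freshly created
 * shmem file without dipping into the PF_MEMALLOC reserves. */
static int example_copy_to_shmem(struct page *src, struct file **filp_out)
{
	gfp_t gfp = GFP_KERNEL | __GFP_NOMEMALLOC | __GFP_RETRY_MAYFAIL;
	struct file *filp;
	struct page *dst;

	filp = shmem_file_setup("ttm swap", PAGE_SIZE, 0);
	if (IS_ERR(filp))
		return PTR_ERR(filp);

	/* __GFP_NOMEMALLOC makes sure that being called from reclaim
	 * (PF_MEMALLOC) cannot deplete the emergency reserves. */
	dst = shmem_read_mapping_page_gfp(filp->f_mapping, 0, gfp);
	if (IS_ERR(dst)) {
		fput(filp);
		return PTR_ERR(dst);
	}

	copy_highpage(dst, src);
	set_page_dirty(dst);
	mark_page_accessed(dst);
	put_page(dst);

	*filp_out = filp;
	return 0;
}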
Thanks, that was exactly one of the things I was absolutely not sure about.
And yes, I agree that this is really tricky.
Ideally I would like to be able to trigger swapping out the shmem page I
allocated immediately after doing the copy.
This way I would only need a single page for the whole shrink operation
at any given time.
Also note that if the allocation cannot succeed, it will not trigger reclaim
again because you are already called from the reclaim context.
4. Free the pages which are not swappable/reclaimable.
The pages we got from the shmem file are easily swappable to disk after the
copy is completed, but only if IO is not already blocked because the
shrinker was called from an allocation restricted by GFP_NOFS or GFP_NOIO.
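The usual way to handle that on the shrinker side is to look at the gfp mask
in struct shrink_control and bail out when IO/FS is not allowed; a minimal
sketch (the callback name is made up, the sc->gfp_mask check is the common
idiom):

#include <linux/shrinker.h>
#include <linux/gfp.h>

/* Illustrative scan callback: skip the swap-to-shmem path entirely when
 * the reclaim context does not allow the IO/FS needed to actually write
 * the shmem pages out to disk. */
static unsigned long example_tt_shrink_scan(struct shrinker *shrink,
					    struct shrink_control *sc)
{
	if (!(sc->gfp_mask & __GFP_FS) || !(sc->gfp_mask & __GFP_IO))
		return SHRINK_STOP;

	/* ... unpopulate the TTs, copy their pages into shmem, free them ... */
	return 0;
}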
Sorry for being dense here, but I still do not follow the actual problem
(well, except for the above-mentioned one). Is the sole point of this to
emulate a GFP_NO* allocation context and see how the shrinker behaves?
Please be as dense as you need to be :)
I think Daniel and I only have a very rough understanding of the memory
management details here, but we need exactly that knowledge to get the
GPU memory management into the shape we want it to be in.
Thanks,
Christian.