Re: [PATCH] drm/ttm: stop warning on TT shrinker failure

Christian König <christian.koenig@xxxxxxx> · Tue, 23 Mar 2021 14:56:54 +0100

On 23.03.21 at 14:41, Michal Hocko wrote:
On Tue 23-03-21 14:06:25, Christian König wrote:
On 23.03.21 at 13:37, Michal Hocko wrote:
On Tue 23-03-21 13:21:32, Christian König wrote:
[...]
Ideally I would like to be able to trigger swapping out the shmem page I
allocated immediately after doing the copy.
So let me try to rephrase to make sure I understand. You would like to
swap out the existing content from the shrinker and you use shmem as a
way to achieve that. The swapout should happen at the time of copying
(shrinker context) or shortly afterwards?

So effectively to call pageout() on the shmem page after the copy?
Yes, exactly that.
OK, good. I see what you are trying to achieve now. I do not think we
would want to allow pageout from the shrinker's context but what you can
do is to instantiate the shmem page into the tail of the inactive list
so the next reclaim attempt will swap it out (assuming swap is available
of course).

Yes, that's at least my understanding of how we currently do it.

Problem with that approach is that I first copy over the whole object 
into shmem and then free it.

So instead of temporarily using a single page, I need temporary storage
as large as the whole buffer object for the shmem copy, and that can
be a couple of hundred MiB.
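
To put numbers on that difference, here is a minimal userspace sketch. It
is only a counting model, not TTM or shmem code: the function name
peak_temp_pages and the "batch" parameter are made up for illustration,
and it assumes a page is freed as soon as its copy exists.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model: a buffer object of num_pages pages is copied into
 * shmem "batch" pages at a time, and the originals of each batch are
 * freed right after that batch is copied.
 *
 * Returns the peak number of pages held on top of the object's own size.
 */
static size_t peak_temp_pages(size_t num_pages, size_t batch)
{
    size_t orig = num_pages;  /* original pages still allocated */
    size_t shmem = 0;         /* shmem copies allocated so far */
    size_t peak = 0;

    assert(batch > 0);

    while (orig > 0) {
        size_t n = orig < batch ? orig : batch;
        size_t extra;

        shmem += n;                       /* allocate and fill the copies */
        extra = orig + shmem - num_pages; /* overhead before freeing */
        if (extra > peak)
            peak = extra;
        orig -= n;                        /* free the originals */
    }
    return peak;
}
```

With 4 KiB pages a 200 MiB object is 51200 pages. Copying everything
first (batch equal to num_pages) peaks at 51200 extra pages, while a
page-by-page variant (batch of 1) peaks at a single extra page.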

This is not really something that our existing infrastructure gives you
though, I am afraid. There is no way to tell a newly allocated shmem
page should be in fact cold and the first one to swap out. But there are
people more familiar with shmem and its peculiarities so I might be wrong
here.

Anyway, I am wondering whether the overall approach is sound. Why don't
you simply use shmem as your backing storage from the beginning and pin
those pages if they are used by the device?

Yeah, that is exactly what the Intel guys are doing for their integrated 
GPUs :)

Problem is that for TTM I need to be able to handle dGPUs, and those have
all kinds of funny allocation restrictions. In other words I need to
guarantee that the allocated memory is coherently accessible to the GPU
without using SWIOTLB.

The simple case is that the device can only do DMA32, but you also have
devices which can only do 40 bits or 48 bits.
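
As a rough illustration of what such an addressing limit means, here is
a small userspace sketch. The mask has the same shape as the kernel's
DMA_BIT_MASK(n) macro, but the helper names dma_bit_mask and
dma_range_ok are invented for this example and are not kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Highest bus address a device with an n-bit DMA limit can reach. */
static uint64_t dma_bit_mask(unsigned int bits)
{
    assert(bits >= 1 && bits <= 64);
    return bits == 64 ? ~UINT64_C(0) : (UINT64_C(1) << bits) - 1;
}

/*
 * True if the whole range [addr, addr + size) lies below the device
 * limit, i.e. the device could reach it without a bounce buffer.
 */
static bool dma_range_ok(uint64_t addr, uint64_t size, unsigned int bits)
{
    uint64_t mask = dma_bit_mask(bits);

    if (size == 0 || addr > mask)
        return false;
    /* Written this way so that addr + size cannot overflow. */
    return size - 1 <= mask - addr;
}
```

A page at 4 GiB (0x100000000) fails the check for a 32-bit device but
passes for a 40-bit one, which is why the allocation has to be steered
per device rather than taken from wherever shmem happens to place it.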

On top of that you also have AGP, CMA and stuff like CPU cache behavior
changes (write-back vs. write-through vs. uncached).

Regards,
Christian.