Re: [PATCH 17/35] drm/amdkfd: register HMM device private zone

On 3/1/21 9:58 AM, Daniel Vetter wrote:
> On Mon, Mar 01, 2021 at 09:46:44AM +0100, Thomas Hellström (Intel) wrote:
> > On 3/1/21 9:32 AM, Daniel Vetter wrote:
> > > On Wed, Jan 06, 2021 at 10:01:09PM -0500, Felix Kuehling wrote:
> > > > From: Philip Yang <Philip.Yang@xxxxxxx>
> > > >
> > > > Register vram memory as MEMORY_DEVICE_PRIVATE type resource, to
> > > > allocate vram backing pages for page migration.
> > > >
> > > > Signed-off-by: Philip Yang <Philip.Yang@xxxxxxx>
> > > > Signed-off-by: Felix Kuehling <Felix.Kuehling@xxxxxxx>
> > > So maybe I'm getting this all wrong, but I think that the current ttm
> > > fault code relies on devmap pte entries (especially for hugepte entries)
> > > to stop get_user_pages. But this only works if the pte happens to not
> > > point at a range with devmap pages.
> > I don't think that's in TTM yet, but the proposed fix, yes (see email I
> > just sent in another thread), but only for huge ptes.
> > >
> > > This patch here changes that, and so probably breaks this devmap pte hack
> > > ttm is using?
> > >
> > > If I'm not wrong here then I think we need to first fix up the ttm code to
> > > not use the devmap hack anymore, before a ttm based driver can register a
> > > dev_pagemap. Also adding Thomas since that just came up in another
> > > discussion.
> > It doesn't break the ttm devmap hack per se, but it does allow gup on the
> > registered range. This is where my understanding falls short: why can't
> > we allow gup-ing TTM ptes if there is indeed a backing struct page?
> > Registering MEMORY_DEVICE_PRIVATE implies that, right?
> We need to keep supporting buffer-based memory management for all the
> non-compute users, because those require end-of-batch dma_fence semantics,
> which prevents us from using gpu page faults, which makes hmm not really
> work.
>
> And for buffer-based memory management we can't have gup pin random pages
> in there; that's not really how it works. Worst case, ttm just assumes it
> can actually move buffers and reallocate them as it sees fit, and your gup
> mapping (for direct i/o or whatever) now points at a page of a buffer that
> you don't even own anymore. That's not good. Hence also all the
> discussions about preventing gup for bo mappings in general.
>
> Once we throw hmm into the mix we need to be really careful that the two
> worlds don't collide. Pure hmm is fine, pure bo managed memory is fine,
> mixing them is tricky.
> -Daniel

Hmm, OK, so then registering MEMORY_DEVICE_PRIVATE means we can't set
pxx_devmap, because that would allow gup, which in turn means no huge
TTM ptes.
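
To spell out the reasoning discussed above, here is a toy userspace model,
not kernel code. It assumes, as the thread suggests, that gup on a
devmap-marked entry only succeeds when a registered pagemap range covers
the pfn. All names (toy_pte, toy_pagemap, toy_gup_would_pin,
toy_self_check) are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a registered device pagemap: a half-open pfn range. */
struct toy_pagemap {
    unsigned long start_pfn;
    unsigned long end_pfn;   /* exclusive */
};

/* Toy page-table entry: the pfn it points at plus a devmap marker. */
struct toy_pte {
    unsigned long pfn;
    bool devmap;
};

/* Return true if some registered range covers pfn. */
static bool toy_pagemap_covers(const struct toy_pagemap *maps, size_t n,
                               unsigned long pfn)
{
    for (size_t i = 0; i < n; i++) {
        if (pfn >= maps[i].start_pfn && pfn < maps[i].end_pfn)
            return true;
    }
    return false;
}

/*
 * Toy gup decision: true means the page would be pinned.
 * Assumption of this sketch: a devmap-marked entry is only pinnable
 * when a registered pagemap covers its pfn; anything else is treated
 * as an ordinary, pinnable page.
 */
static bool toy_gup_would_pin(struct toy_pte pte,
                              const struct toy_pagemap *maps, size_t n)
{
    if (!pte.devmap)
        return true;
    return toy_pagemap_covers(maps, n, pte.pfn);
}

/* Walks through the before/after scenario from the thread. */
static void toy_self_check(void)
{
    struct toy_pte ttm_pte = { .pfn = 0x1000, .devmap = true };
    struct toy_pagemap vram = { .start_pfn = 0x1000, .end_pfn = 0x2000 };

    /* No pagemap registered: the devmap marker stops gup. */
    assert(!toy_gup_would_pin(ttm_pte, NULL, 0));

    /* VRAM range registered: the same entry now becomes pinnable. */
    assert(toy_gup_would_pin(ttm_pte, &vram, 1));
}
```

In this model the devmap marker only blocks gup while nothing is
registered for the range, which is why the two approaches conflict.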

/Thomas

_______________________________________________
amd-gfx mailing list
amd-gfx@xxxxxxxxxxxxxxxxxxxxx
https://lists.freedesktop.org/mailman/listinfo/amd-gfx



