On 01.03.21 at 10:21, Thomas Hellström (Intel) wrote:
On 3/1/21 10:05 AM, Daniel Vetter wrote:
On Mon, Mar 01, 2021 at 09:39:53AM +0100, Thomas Hellström (Intel) wrote:
Hi,
On 3/1/21 9:28 AM, Daniel Vetter wrote:
On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel)
<thomas_os@xxxxxxxxxxxx> wrote:
On 2/26/21 2:28 PM, Daniel Vetter wrote:
So I think it stops gup. But I haven't verified this at all. It would
be good if Christian could check this with some direct I/O to a buffer
in system memory.
Hmm,
the docs (again for vm_normal_page()) say:
 * VM_MIXEDMAP mappings can likewise contain memory with or without "struct
 * page" backing, however the difference is that _all_ pages with a struct
 * page (that is, those where pfn_valid is true) are refcounted and considered
 * normal pages by the VM. The disadvantage is that pages are refcounted
 * (which can be slower and simply not an option for some PFNMAP users). The
 * advantage is that we don't have to follow the strict linearity rule of
 * PFNMAP mappings in order to support COWable mappings.
but it's true that __vm_insert_mixed() ends up in the insert_pfn()
path, so the above isn't really true, which makes me wonder whether,
and if so why, there could still be a significant performance
difference between MIXEDMAP and PFNMAP.
Yeah it's definitely confusing. I guess I'll hack up a patch and see
what sticks.
BTW regarding the TTM huge PTEs, I don't think we ever landed that
devmap hack, so they are (for the non-gup case) relying on
vma_is_special_huge(). For the gup case, I think the bug is still
there.
Maybe there's another devmap hack, but the ttm_vm_insert functions do
use PFN_DEV and all that. And I think that stops gup_fast from trying
to find the underlying page.
-Daniel
Hmm, perhaps it might, but I don't think so. The fix I tried out was to
set PFN_DEV | PFN_MAP for huge PTEs, which causes pfn_devmap() to be
true; then follow_devmap_pmd()->get_dev_pagemap() returns NULL and
gup_fast() backs off.

In the end that would mean setting in stone that "if there is a huge
devmap page table entry for which we haven't registered any devmap
struct pages (get_dev_pagemap returns NULL), we should treat that as a
"special" huge page table entry".
From what I can tell, all code calling get_dev_pagemap() already does
that; it's just a question of getting it accepted and formalizing it.
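
To make the proposed rule concrete, here is a minimal user-space sketch
of the decision being discussed. It is a toy model, not kernel code:
struct toy_huge_entry, toy_get_dev_pagemap() and toy_gup_huge() are
hypothetical names invented for illustration, and the only behaviour
taken from the discussion above is "devmap entry with no registered
pagemap means back off".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a huge page-table entry; NOT the kernel's pmd_t. */
struct toy_huge_entry {
    bool devmap;      /* entry carries the devmap mark */
    bool has_pagemap; /* devmap struct pages were registered for it */
};

enum toy_gup_result {
    TOY_GUP_PIN_PAGES, /* treated as normal memory: references are taken */
    TOY_GUP_BACK_OFF   /* treated as "special": nothing to pin */
};

/*
 * Hypothetical stand-in for get_dev_pagemap(): returns NULL when no
 * devmap struct pages were registered for the entry.
 */
static const void *toy_get_dev_pagemap(const struct toy_huge_entry *e)
{
    static const int registered_pagemap = 0; /* dummy object to point at */

    return e->has_pagemap ? &registered_pagemap : NULL;
}

/* The rule under discussion, written out as a decision function. */
static enum toy_gup_result toy_gup_huge(const struct toy_huge_entry *e)
{
    assert(e != NULL);

    /* No devmap mark: gup has no hint and treats the entry as normal. */
    if (!e->devmap)
        return TOY_GUP_PIN_PAGES;

    /* Devmap mark but no pagemap: treat as a "special" huge entry. */
    if (toy_get_dev_pagemap(e) == NULL)
        return TOY_GUP_BACK_OFF;

    /* Devmap mark with a registered pagemap: pages can be pinned. */
    return TOY_GUP_PIN_PAGES;
}
```

In this model an entry with devmap set and has_pagemap clear is the
TTM-like case and yields TOY_GUP_BACK_OFF, while an entry without the
devmap mark falls through to TOY_GUP_PIN_PAGES, which is the failure
mode the thread is worried about.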
Oh, I thought that's already how it works, since I didn't spot anything
else that would block gup_fast from falling over. I guess we really
would need some testcases to make sure direct I/O (that's the easiest
to test) fails like we expect.
Yeah, IIRC the "| PFN_MAP" is the missing piece for TTM huge ptes.
Otherwise pmd_devmap() will not return true and since there is no
pmd_special() things break.
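
A small sketch of why the second flag matters, assuming the devmap test
requires both flags to be set at once (as implied above). The TOY_*
names and bit positions are made up for illustration and are not the
kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit positions only; the real values are in the kernel headers. */
#define TOY_PFN_DEV (UINT64_C(1) << 61)
#define TOY_PFN_MAP (UINT64_C(1) << 60)

/* Combine a raw pfn with flag bits (hypothetical helper). */
static uint64_t toy_make_pfn(uint64_t pfn, uint64_t flags)
{
    /* The raw pfn must not already overlap the flag bits. */
    assert((pfn & (TOY_PFN_DEV | TOY_PFN_MAP)) == 0);
    return pfn | flags;
}

/* Assumed devmap test: true only if BOTH flags are present. */
static bool toy_pfn_is_devmap(uint64_t pfn_val)
{
    const uint64_t flags = TOY_PFN_DEV | TOY_PFN_MAP;

    return (pfn_val & flags) == flags;
}
```

Under that assumption, toy_make_pfn(pfn, TOY_PFN_DEV) alone does not
count as devmap, while TOY_PFN_DEV | TOY_PFN_MAP does, which matches
the "missing piece" described above.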
Is that maybe the issue we have seen with amdgpu and huge pages?
Apart from that I'm lost, guys; the devmap and gup stuff is not
something I know well beyond a mile-high view.
Christian.
/Thomas
-Daniel