On 11/15/21 17:49, Jason Gunthorpe wrote: > On Mon, Nov 15, 2021 at 01:11:32PM +0100, Joao Martins wrote: >> On 11/12/21 16:34, Jason Gunthorpe wrote: >>> On Fri, Nov 12, 2021 at 04:08:24PM +0100, Joao Martins wrote: >>>> diff --git a/drivers/dax/device.c b/drivers/dax/device.c >>>> index a65c67ab5ee0..0c2ac97d397d 100644 >>>> +++ b/drivers/dax/device.c >>>> @@ -192,6 +192,42 @@ static vm_fault_t __dev_dax_pud_fault(struct dev_dax *dev_dax, >>>> } >>>> #endif /* !CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */ >>>> >>>> +static void set_page_mapping(struct vm_fault *vmf, pfn_t pfn, >>>> + unsigned long fault_size, >>>> + struct address_space *f_mapping) >>>> +{ >>>> + unsigned long i; >>>> + pgoff_t pgoff; >>>> + >>>> + pgoff = linear_page_index(vmf->vma, ALIGN(vmf->address, fault_size)); >>>> + >>>> + for (i = 0; i < fault_size / PAGE_SIZE; i++) { >>>> + struct page *page; >>>> + >>>> + page = pfn_to_page(pfn_t_to_pfn(pfn) + i); >>>> + if (page->mapping) >>>> + continue; >>>> + page->mapping = f_mapping; >>>> + page->index = pgoff + i; >>>> + } >>>> +} >>>> + >>>> +static void set_compound_mapping(struct vm_fault *vmf, pfn_t pfn, >>>> + unsigned long fault_size, >>>> + struct address_space *f_mapping) >>>> +{ >>>> + struct page *head; >>>> + >>>> + head = pfn_to_page(pfn_t_to_pfn(pfn)); >>>> + head = compound_head(head); >>>> + if (head->mapping) >>>> + return; >>>> + >>>> + head->mapping = f_mapping; >>>> + head->index = linear_page_index(vmf->vma, >>>> + ALIGN(vmf->address, fault_size)); >>>> +} >>> >>> Should this stuff be setup before doing vmf_insert_pfn_XX? >>> >> >> Interestingly filesystem-dax does this, but not device-dax. > > I think it may be a bug ? > Possibly. Dan, any thoughts (see also below) ? You probably hold all that history since its inception on commit 2232c6382a4 ("device-dax: Enable page_mapping()") and commit 35de299547d1 ("device-dax: Set page->index"). >> set_page_mapping/set_compound_mapping() could be moved to before and >> then torn down on @rc != VM_FAULT_NOPAGE (failure). I am not sure >> what's the benefit in this series.. besides the ordering (that you >> hinted below) ? > > Well, it should probably be fixed in a precursor patch. > Yeap -- I would move page_mapping prior to introduce set_compound_mapping(). Now I am thinking again and with that logic it makes more sense to ammend inside set_page_mapping() -- should have less nested around in the fault handler. > I think the general idea is that page->mapping/index are stable once > the page is published in a PTE? > /me nods >>> In normal cases the page should be returned in the vmf and populated >>> to the page tables by the core code after all this is done. >> >> So I suppose by call sites examples as 'core code' is either hugetlbfs call to >> __filemap_add_folio() (on hugetlbfs fault handler), shmem_add_to_page_cache() or >> anon-equivalent. > > I was talking more about the normal page insertion flow which is done > by setting vmf->page and then returning. finish_fault() will install > the PTE > I misunderstood you earlier -- I thought you were suggesting me to look at how mapping/index is set (in the context of the flow you just described) Joao