On Mon, Jul 29, 2019 at 02:02:52PM +0530, Anshuman Khandual wrote:
> On 07/27/2019 01:24 AM, Matthew Wilcox wrote:
> > On Fri, Jul 26, 2019 at 10:17:11AM +0530, Anshuman Khandual wrote:
> >>> But 'page' isn't necessarily PMD-aligned. I don't think we can rely on
> >>> architectures doing the right thing if asked to make a PMD for a randomly
> >>> aligned page.
> >>>
> >>> How about finding the physical address of something like kernel_init(),
> >>
> >> Physical address corresponding to the symbol in the kernel text segment ?
> >
> > Yes. We need the address of something that's definitely memory.
> > The stack might be in vmalloc space. We can't allocate memory from the
> > allocator that's PUD-aligned. This seems like a reasonable approximation
> > to something that might work.
>
> Okay sure. What is about vmalloc space being PUD aligned and how that is
> problematic here ? Could you please give some details. Just being curious.

Those were two different sentences. We can't use the address of something
on the stack, because we don't know whether the stack is in vmalloc space
or in the direct map. We can't use the address of something we've allocated
from the page allocator, because the page allocator can't give us PUD-aligned
memory.

> > I think that's a mistake. As Russell said, the ARM p*d manipulation
> > functions expect to operate on tables, not on individual entries
> > constructed on the stack.
>
> Hmm. I assume that it will take care of dual 32 bit entry updates on arm
> platform through various helper functions as Russel had mentioned earlier.
> After we create page table with p?d_alloc() functions and pick an entry at
> each page table level.

Right.

> > So I think the right thing to do here is allocate an mm, then do the
> > pgd_alloc / p4d_alloc / pud_alloc / pmd_alloc / pte_alloc() steps giving
> > you real page tables that you can manipulate.
> >
> > Then destroy them, of course. And don't access through them.
>
> mm_alloc() seems like a comprehensive helper to allocate and initialize a
> mm_struct. But could we use mm_init() with 'current' in the driver context or we
> need to create a dummy task_struct for this purpose. Some initial tests show that
> p?d_alloc() and p?d_free() at each level with a fixed virtual address gives p?d_t
> entries required at various page table level to test upon.

I think it's wise to start a new mm. I'm not sure exactly what calls to make
to get one going.

> >>>> +#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
> >>>> +static void pud_basic_tests(void)
> >>>
> >>> Is this the right ifdef?
> >>
> >> IIUC THP at PUD is where the pud_t entries are directly operated upon and the
> >> corresponding accessors are present only when HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
> >> is enabled. Am I missing something here ?
> >
> > Maybe I am. I thought we could end up operating on PUDs for kernel mappings,
> > even without transparent hugepages turned on.
>
> In generic MM ? IIUC except ioremap mapping all other PUD handling for kernel virtual
> range is platform specific. All the helpers used in the function pud_basic_tests() are
> part of THP and used in mm/huge_memory.c

But what about hugetlbfs? And vmalloc can also use larger pages these days.
I don't think these tests should be conditional on transparent hugepages.
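
To make the PUD-alignment point from earlier in this mail concrete, here is
a minimal userspace sketch (not kernel code). Every EX_* name and value is
an assumption chosen for illustration: 4K pages, 2M PMDs, 1G PUDs as on
x86-64, and a largest buddy order of 10. Other architectures and configs
will differ. It only shows the arithmetic: the strongest alignment the page
allocator could promise under those assumptions is far below PUD size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration only (x86-64 style, 4K pages). */
#define EX_PAGE_SHIFT     12	/* 4K base pages */
#define EX_PMD_SHIFT      21	/* one PMD entry maps 2M */
#define EX_PUD_SHIFT      30	/* one PUD entry maps 1G */
#define EX_LARGEST_ORDER  10	/* assumed largest page allocator order */

/* True if addr is a multiple of (1 << shift). */
static bool ex_is_aligned(uint64_t addr, unsigned int shift)
{
	assert(shift < 64);
	return (addr & ((UINT64_C(1) << shift) - 1)) == 0;
}

/*
 * A buddy-style allocator hands out blocks that are naturally aligned to
 * their own size, so the largest block size is also the strongest
 * alignment it can guarantee.
 */
static uint64_t ex_max_guaranteed_alignment(void)
{
	return UINT64_C(1) << (EX_PAGE_SHIFT + EX_LARGEST_ORDER);
}
```

With these assumed numbers, ex_max_guaranteed_alignment() is 4M while a PUD
needs 1G, and an address such as 1G + 2M passes
ex_is_aligned(addr, EX_PMD_SHIFT) but fails ex_is_aligned(addr, EX_PUD_SHIFT).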