On Tue, Sep 29, 2020 at 04:33:16PM +0100, Matthew Wilcox wrote:
> I think we can end up truncating a PMD or PGD entry (I get confused
> easily about levels of the page tables; bear with me)
>
> /* NOTE: even on 64 bits, these entries are __u32 because we allocate
>  * the pmd and pgd in ZONE_DMA (i.e. under 4GB) */
> typedef struct { __u32 pgd; } pgd_t;
> ...
> typedef struct { __u32 pmd; } pmd_t;
>
> ...
>
> 	pgd_t *pgd = (pgd_t *)__get_free_pages(GFP_KERNEL,
> 					       PGD_ALLOC_ORDER);
> ...
> 	return (pmd_t *)__get_free_pages(GFP_PGTABLE_KERNEL, PMD_ORDER);
>
> so if we have more than 2GB of RAM, we can allocate a page with the top
> bit set, which we interpret to mean PAGE_PRESENT in the TLB miss handler
> and mask it off, causing us to load the wrong page for the next level
> of the page table walk.
>
> Have I missed something?

Yes, yes I have.  We store the PFN, not the physical address.  So we
have 28 bits for storing the PFN and 4 bits for the PxD bits, supporting
28 + 12 = 40 bits (1TB) of physical address space.
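
To make the arithmetic concrete, here's a minimal userspace sketch of
that encoding.  The macro names, bit positions and helpers (PxD_FLAG_BITS,
PxD_PRESENT, pxd_mkentry, pxd_phys) are illustrative assumptions, not the
real parisc definitions; the point is only that a 32-bit entry holding a
28-bit PFN plus 4 flag bits reaches 28 + 12 = 40 bits of physical address,
so a page-table page allocated above 2GB never collides with the flags.

/*
 * Illustrative sketch only -- NOT the real parisc layout.  Assumed
 * encoding: 32-bit PxD entry = 4 flag bits (high) + 28-bit PFN (low).
 */
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT	12			/* 4KB pages */
#define PxD_FLAG_BITS	4			/* bits reserved for flags */
#define PxD_PFN_BITS	(32 - PxD_FLAG_BITS)	/* 28 bits of PFN */
#define PxD_PFN_MASK	((1u << PxD_PFN_BITS) - 1)
#define PxD_PRESENT	(1u << 31)		/* hypothetical present flag */

/* Pack a PFN and flag bits into one 32-bit table entry. */
static inline uint32_t pxd_mkentry(uint64_t pfn, uint32_t flags)
{
	return flags | (uint32_t)(pfn & PxD_PFN_MASK);
}

/* Recover the physical address of the next-level table. */
static inline uint64_t pxd_phys(uint32_t entry)
{
	return (uint64_t)(entry & PxD_PFN_MASK) << PAGE_SHIFT;
}

int main(void)
{
	/*
	 * A table page at physical address 3GB: as a raw address its
	 * top bit is set (the original worry above), but as a PFN it
	 * is only 0xc0000, nowhere near the flag bits.  The largest
	 * encodable PFN gives (2^28 - 1) << 12, i.e. just under 1TB.
	 */
	uint64_t phys = 3ull << 30;		/* 3GB */
	uint32_t entry = pxd_mkentry(phys >> PAGE_SHIFT, PxD_PRESENT);

	printf("entry = %#010x\n", entry);
	printf("phys  = %#llx\n", (unsigned long long)pxd_phys(entry));
	printf("max   = %#llx\n",		/* ~1TB */
	       (unsigned long long)pxd_phys(PxD_PFN_MASK));
	return 0;
}

Had the entries stored raw physical addresses instead, pxd_mkentry(3GB)
would indeed have set the assumed present bit and the masked-off walk
would load the wrong page, which is exactly the failure described in the
quoted message.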