On Sun, Oct 04, 2020 at 07:29:33AM +0200, Helge Deller wrote:
> On 9/29/20 8:14 PM, Matthew Wilcox wrote:
> > It's talking about 8TB of virtual address space. But I think it's wrong.
> > On 64-bit,
> >
> > Each PTE defines a 4kB region of address space (ie one page).
> > Each PMD is a 4kB allocation with 8-byte entries, so covers 512 * 4kB = 2MB
>
> No, PMD is 4kb allocation with 4-byte entries, so covers 1024 * 4kb = 4MB
> We always us 4-byte entries, for 32- and 64-bit kernels.

#if CONFIG_PGTABLE_LEVELS == 3
#define PGD_ORDER 1 /* Number of pages per pgd */
#define PMD_ORDER 1 /* Number of pages per pmd */
#define PGD_ALLOC_ORDER (2 + 1) /* first pgd contains pmd */
...
#if CONFIG_PGTABLE_LEVELS == 3
...
static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long address)
{
	return (pmd_t *)__get_free_pages(GFP_PGTABLE_KERNEL, PMD_ORDER);
}

We're definitely doing an 8kB allocation. If we should be allocating 4kB,
then PMD_ORDER should be 0. The 32-bit entries, even on 64-bit are a nice
hack. I think that just means we're over-allocating memory for the page
tables.

> > Each PGD is an 8kB allocation with 4-byte entries, so covers 2048 * 2M = 4GB
>
> No. each PGD is a 4kb allocation with 4-byte entries. so covers 1024 * 4MB = 4GB
> Still, my calculation ends up with 4GB, like yours.

Again, I think there's an order vs count confusion here.

> > The top-level allocation is a 32kB allocation, but the first 8kB is used
> > for the first PGD, so it covers 24kB / 4 bytes * 4GB = 24TB.
>
> size of PGD (swapper_pg_dir) is 8k, so we have 8k / 4 bytes * 4GB = 8 TB
> virtual address space.
>
> At boot we want to map (1 << KERNEL_INITIAL_ORDER) pages (=64MB on 64bit kernel)
> and for this pmd0 gets pre-allocated with 8k size, and pg0 with 132k to
> simplify the filling the initial page tables - but that's not relevant for
> the calculations above.

I was talking about pgd_alloc():

	pgd_t *pgd = (pgd_t *)__get_free_pages(GFP_KERNEL, PGD_ALLOC_ORDER);

where we allocate 8 * 4kB pages

> > I think the top level allocation was supposed to be an order-2 allocation,
> > which would be an 8TB address space, but it's order-3.
> >
> > There's a lot of commentary which disagrees with the code. For example,
> >
> > #define PMD_ORDER 1 /* Number of pages per pmd */
> > That's just not true; an order-1 allocation is 2 pages, not 1.
>
> Yes, that should be fixed up.
>
> Helge
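
For reference, the order-vs-count arithmetic in this thread can be checked
with a small userspace sketch. This is not kernel code: the helper names
order_to_bytes(), table_coverage() and report_coverage() are made up for
illustration, and the figures assume 4kB pages, 8-byte PTEs and 4-byte
PMD/PGD entries, as in the 64-bit calculation quoted above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define PAGE_BYTES 4096ULL	/* assumed 4kB base page size */

/* An order-n allocation is 2^n contiguous pages, not n pages. */
static uint64_t order_to_bytes(unsigned int order)
{
	return PAGE_BYTES << order;
}

/*
 * Address space covered by one table: the number of entries that fit in
 * table_bytes, times the span that each entry maps.
 */
static uint64_t table_coverage(uint64_t table_bytes, uint64_t entry_bytes,
			       uint64_t span_per_entry)
{
	assert(entry_bytes != 0);
	return (table_bytes / entry_bytes) * span_per_entry;
}

/* Print the coverage at each level for the figures used in the thread. */
static void report_coverage(void)
{
	/* One 4kB page of 8-byte PTEs, each mapping one 4kB page. */
	uint64_t low = table_coverage(order_to_bytes(0), 8, PAGE_BYTES);
	/* An order-1 (8kB) table of 4-byte entries, each covering 'low'. */
	uint64_t mid = table_coverage(order_to_bytes(1), 4, low);
	/* Top level: allocation minus the 8kB reserved at its start. */
	uint64_t top3 = table_coverage(order_to_bytes(3) - order_to_bytes(1),
				       4, mid);
	uint64_t top2 = table_coverage(order_to_bytes(2) - order_to_bytes(1),
				       4, mid);

	printf("low level:     %llu MB\n", (unsigned long long)(low >> 20));
	printf("middle level:  %llu GB\n", (unsigned long long)(mid >> 30));
	printf("top, order-3:  %llu TB\n", (unsigned long long)(top3 >> 40));
	printf("top, order-2:  %llu TB\n", (unsigned long long)(top2 >> 40));
}
```

Calling report_coverage() from a main() reproduces the 2MB, 4GB, 24TB and
8TB figures, which shows that the 8TB number only falls out if the
top-level allocation is order-2 rather than order-3.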