Re: Page tables on machines with >2GB RAM

On Sun, Oct 04, 2020 at 07:29:33AM +0200, Helge Deller wrote:
> On 9/29/20 8:14 PM, Matthew Wilcox wrote:
> > It's talking about 8TB of virtual address space.  But I think it's wrong.
> > On 64-bit,
> >
> > Each PTE defines a 4kB region of address space (ie one page).
> > Each PMD is a 4kB allocation with 8-byte entries, so covers 512 * 4kB = 2MB
> 
> No, PMD is a 4kB allocation with 4-byte entries, so covers 1024 * 4kB = 4MB
> We always use 4-byte entries, for 32- and 64-bit kernels.

#if CONFIG_PGTABLE_LEVELS == 3
#define PGD_ORDER       1 /* Number of pages per pgd */
#define PMD_ORDER       1 /* Number of pages per pmd */
#define PGD_ALLOC_ORDER (2 + 1) /* first pgd contains pmd */
...
#if CONFIG_PGTABLE_LEVELS == 3
...
static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long address)
{
        return (pmd_t *)__get_free_pages(GFP_PGTABLE_KERNEL, PMD_ORDER);
}

We're definitely doing an 8kB allocation.  If we should be allocating
4kB, then PMD_ORDER should be 0.
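
To make the order vs size relationship concrete, here is a minimal
standalone sketch (not kernel code).  The helper names and
SKETCH_PAGE_SIZE are made up for illustration, and it assumes 4kB pages
and the 4-byte entries described above:

```c
#include <assert.h>

/* Assumption for this sketch: 4kB base pages, as in this thread. */
#define SKETCH_PAGE_SIZE 4096UL

/* An order-n allocation is 2^n contiguous pages. */
static inline unsigned long order_to_pages(unsigned int order)
{
        return 1UL << order;
}

/* Size in bytes of an order-n allocation. */
static inline unsigned long order_to_bytes(unsigned int order)
{
        return order_to_pages(order) * SKETCH_PAGE_SIZE;
}

/* How many fixed-size entries fit in an order-n allocation. */
static inline unsigned long entries_per_table(unsigned int order,
                                              unsigned long entry_size)
{
        return order_to_bytes(order) / entry_size;
}

/* Hypothetical self-check: order 1 is two pages, not one. */
static inline void order_sketch_self_check(void)
{
        assert(order_to_pages(1) == 2);
        assert(order_to_bytes(1) == 8192);        /* 8kB */
        assert(order_to_bytes(0) == 4096);        /* 4kB */
        assert(entries_per_table(1, 4) == 2048);  /* 4-byte entries */
}
```

So an order of 1 passed to __get_free_pages() asks for two pages, and
only an order of 0 gives a single 4kB page.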

The 32-bit entries, even on 64-bit, are a nice hack.  I think that just means
we're over-allocating memory for the page tables.

> > Each PGD is an 8kB allocation with 4-byte entries, so covers 2048 * 2M = 4GB
> 
> No, each PGD is a 4kB allocation with 4-byte entries, so covers 1024 * 4MB = 4GB
> Still, my calculation ends up with 4GB, like yours.

Again, I think there's an order vs count confusion here.  PGD_ORDER is 1,
which is an order-1 allocation of two pages (8kB), not one page.

> > The top-level allocation is a 32kB allocation, but the first 8kB is used
> > for the first PGD, so it covers 24kB / 4 bytes * 4GB = 24TB.
> 
> size of PGD (swapper_pg_dir) is 8k, so we have 8k / 4 bytes * 4GB = 8 TB
> virtual address space.
> 
> At boot we want to map (1 << KERNEL_INITIAL_ORDER) pages (=64MB on 64bit kernel)
> and for this pmd0 gets pre-allocated with 8k size, and pg0 with 132k to
> simplify filling the initial page tables - but that's not relevant for
> the calculations above.

I was talking about pgd_alloc():

        pgd_t *pgd = (pgd_t *)__get_free_pages(GFP_KERNEL,
                                               PGD_ALLOC_ORDER);

where we allocate 8 * 4kB pages (PGD_ALLOC_ORDER is 2 + 1 = 3, so 32kB).
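
The 24TB vs 8TB arithmetic can be checked with a small standalone sketch
(again not kernel code; every name below is hypothetical).  It takes the
figures from this thread as assumptions: 4kB pages, 4-byte entries, 4GB
covered per top-level entry, and the first 8kB of the allocation used
for the embedded first table:

```c
#include <assert.h>
#include <stdint.h>

/* All values below are assumptions taken from the discussion. */
#define SPAN_PAGE_SIZE      4096ULL
#define SPAN_ENTRY_SIZE     4ULL            /* 4-byte entries */
#define SPAN_PER_ENTRY      (4ULL << 30)    /* 4GB per top-level entry */
#define SPAN_RESERVED_BYTES (8ULL * 1024)   /* first 8kB is not top-level */

/* Virtual address space covered by a top-level allocation of this order. */
static inline uint64_t top_level_span(unsigned int alloc_order)
{
        uint64_t bytes = (1ULL << alloc_order) * SPAN_PAGE_SIZE;
        uint64_t entries = (bytes - SPAN_RESERVED_BYTES) / SPAN_ENTRY_SIZE;

        return entries * SPAN_PER_ENTRY;
}

/* Hypothetical self-check of the two cases discussed in the thread. */
static inline void span_sketch_self_check(void)
{
        /* order 3: (32kB - 8kB) / 4 bytes * 4GB = 24TB */
        assert(top_level_span(3) == (24ULL << 40));
        /* order 2: (16kB - 8kB) / 4 bytes * 4GB = 8TB */
        assert(top_level_span(2) == (8ULL << 40));
}
```

Under those assumptions an order-3 allocation gives 24TB and an order-2
allocation gives 8TB.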

> > I think the top level allocation was supposed to be an order-2 allocation,
> > which would be an 8TB address space, but it's order-3.
> >
> > There's a lot of commentary which disagrees with the code.  For example,
> >
> > #define PMD_ORDER       1 /* Number of pages per pmd */
> > That's just not true; an order-1 allocation is 2 pages, not 1.
> 
> Yes, that should be fixed up.
> 
> Helge


