On Thu, Jul 15, 2021 at 07:37:27PM +0100, Russell King (Oracle) wrote:
> On Thu, Jul 15, 2021 at 07:10:54PM +0100, Matthew Wilcox wrote:
> > On Thu, Jul 15, 2021 at 05:47:41PM +0100, Russell King (Oracle) wrote:
> > > On Thu, Jul 15, 2021 at 02:46:10PM +0100, Matthew Wilcox (Oracle) wrote:
> > > > This is the order of the page table allocation, not the order of a PMD.
> > > >
> > > > -#define PMD_ORDER 3
> > > > +#define PMD_TABLE_ORDER 3
> > > > #else
> > > > #define PG_DIR_SIZE 0x4000
> > > > -#define PMD_ORDER 2
> > > > +#define PMD_TABLE_ORDER 2
> > >
> > > I think PMD_ENTRY_ORDER would make more sense here - this is the
> > > power-of-2 of an individual PMD entry, not of the entire table.
> >
> > But ... we have two kinds of PMD entries. We have the direct entry that
> > points to a 1-16MB sized chunk of memory, and we have the table entry that
> > points to a 4k-32k chunk of memory that contains PTEs. So I don't think
> > calling it 'entry' order actually disambiguates anything. That's why
> > I went with 'table' -- I can't think of anything else to call it!
> > PMD_PTE_ARRAY_ORDER doesn't seem like an improvement to me ...
>
> There may be two kinds of PMD entries, but that isn't relevant here.
> Going back to the original terminology, 1 << PMD_ORDER here is the
> size of each PMD entry. It doesn't have anything to do with how much
> memory is being mapped by each entry.

Oh. Oh! So, 'order' is usually a shift that is _added on to_ the
PAGE_SHIFT in order to find how many bytes are in question. See
include/asm-generic/getorder.h. Now, PMD_SHIFT is already in use,
but perhaps what is meant here is PMD_ENTRY_SHIFT?

> I think what is confusing you is stuff like:
>
>	add	r0, r4, #KERNEL_OFFSET >> (SECTION_SHIFT - PMD_ORDER)
>
> r4 is the base address of the page tables, and r0 is the address of
> the entry we want to manipulate for "KERNEL_OFFSET" - which is the
> virtual address. 1 << SECTION_SHIFT is how much memory each entry
> maps (and this is fixed here - there's no variability as you suggest
> above.)

(the variability I intended above was more to accommodate architectural
differences; I hate to use x86-specific numbers like 4KiB and 2MiB)

> Effectively, the calculation above is:
>
>	index = KERNEL_OFFSET >> SECTION_SHIFT;
>	pmd_entry_size = 1 << PMD_ORDER;
>	r0 = base + index * pmd_entry_size;
>
> but in a single instruction as we can be sure that KERNEL_OFFSET will
> have zeros as the low bits after shifting by SECTION_SHIFT - PMD_ORDER.
>
> Hope this helps to explain what this PMD_ORDER is actually doing here.

Thank you, yes, I was terminally confused.
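
For anyone else following along, the equivalence Russell describes can be
checked with a small user-space C sketch. The constants below are
assumptions chosen for illustration (SECTION_SHIFT of 20 paired with the
PMD_ORDER 2 branch of the patch), and the function names are hypothetical;
none of this is the code in head.S.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only, not taken from the kernel headers. */
#define SECTION_SHIFT 20u /* assumed: each entry maps 1 << 20 bytes */
#define PMD_ORDER      2u /* each entry occupies 1 << 2 bytes */

/* Three-step form: find the index, then scale it by the entry size. */
static uint32_t pmd_entry_addr_long(uint32_t base, uint32_t vaddr)
{
	uint32_t index = vaddr >> SECTION_SHIFT;
	uint32_t pmd_entry_size = 1u << PMD_ORDER;

	return base + index * pmd_entry_size;
}

/* Single-shift form, mirroring the operand of the quoted add instruction. */
static uint32_t pmd_entry_addr_short(uint32_t base, uint32_t vaddr)
{
	uint32_t offset = vaddr >> (SECTION_SHIFT - PMD_ORDER);

	/*
	 * The shortcut is only valid when the shifted value has zeros in
	 * its low PMD_ORDER bits, which is the condition stated above.
	 */
	assert((offset & ((1u << PMD_ORDER) - 1u)) == 0);

	return base + offset;
}
```

With a hypothetical table base of 0x4000 and a virtual address of
0xC0000000, both forms give 0x7000: the index is 0xC00, and 0xC00 entries
of 4 bytes each come to 0x3000 bytes into the table.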