Hi Bhupesh, On 2/12/2019 3:44 AM, Bhupesh Sharma wrote: >>>> +/* See 'arch/arm64/include/asm/pgtable-hwdef.h' for definitions below */ >>>> + >>>> +/* >>>> + * Number of page-table levels required to address 'va_bits' wide >>>> + * address, without section mapping. We resolve the top (va_bits - PAGE_SHIFT) >>>> + * bits with (PAGE_SHIFT - 3) bits at each page table level. Hence: >>>> + * >>>> + * levels = DIV_ROUND_UP((va_bits - PAGE_SHIFT), (PAGE_SHIFT - 3)) >>>> + * >>>> + * where DIV_ROUND_UP(n, d) => (((n) + (d) - 1) / (d)) >>>> + * >>>> + * We cannot include linux/kernel.h which defines DIV_ROUND_UP here >>>> + * due to build issues. So we open code DIV_ROUND_UP here: >>>> + * >>>> + * ((((va_bits) - PAGE_SHIFT) + (PAGE_SHIFT - 3) - 1) / (PAGE_SHIFT - 3)) >>>> + * >>>> + * which gets simplified as : >>>> + */ >>>> +#define ARM64_HW_PGTABLE_LEVELS(va_bits) (((va_bits) - 4) / (PAGE_SHIFT - 3)) >> >> Is this needed? > > Yes, it is needed for both the LPA and LVA patches (the LVA patch was sent out separately), since we need to calculate values like PMD_SHIFT on basis of the page-table levels. > > Also this makes the makedumpfile code confirm more to the recent kernel definitions as otherwise one needs to read the makedumpfile code and dig kernel change history to understand the calculation of these macros. > > In future also, I plan to keep these values in sync with the kernel and send patches accordingly. I agree to sync these macros with the kernel, but currently this one is not used in your patches, and you wrote the path of the source file (pgtable-hwdef.h) above for reference, so we don't need to import this one for now. I'd like to import it when it is needed. >>>> +/* Highest possible physical address supported */ >>>> +static inline int >>>> +get_phy_mask_shift_arm64(void) >>>> +{ >>>> + int pa_bits = 48 /* default: 48-bit PA */; >>>> + >>>> + if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER) >>>> + pa_bits = NUMBER(MAX_PHYSMEM_BITS); >>>> + >>>> + return pa_bits; >>>> +} >> >> Is this needed to be an inline function? > > IMO, its better to keep this as inline. Once you modify the part of setting info->max_physmem_bits below, we can remove this function, because we don't use PHYS_MASK_SHIFT anymore. >>>> -#define PMD_TYPE_MASK 3 >>>> -#define PMD_TYPE_SECT 1 >>>> -#define PMD_TYPE_TABLE 3 >>>> +#define PHYS_MASK_SHIFT get_phy_mask_shift_arm64() >>>> >>>> -#define PUD_TYPE_MASK 3 >>>> -#define PUD_TYPE_SECT 1 >>>> -#define PUD_TYPE_TABLE 3 >>>> +/* Helper API to convert between a physical address and its placement >>>> + * in a page table entry, taking care of 52-bit addresses. >>>> + */ >>>> +static inline unsigned long >>>> +__pte_to_phys(pte_t pte) >>>> +{ >>>> + if (lpa_52_bit_support_available) >> >> OK. >> >> According to the related emails, it looks like "PAGESIZE() == SZ_64K" >> is also usable here, but this is the same condition as kernel's one >> and easy to understand. > > No, like I clarified to the upstream maintainers, distributions like Fedora already support a default page size of 64K, but the PA address space can be 48 or 52, depending on the kernel version and kernel CONFIG flags used. My understanding is that with 64k page, we can convert a page table entry to a physicall address without awareness of 52-bit. According to this patch, the top 4 bits of a 52-bit physical address are positioned at bits 12..15 of a page table entry. commit 75387b92635e7dca410c1ef92cfe510019248f76 Author: Kristina Martsenko <kristina.martsenko@xxxxxxx> Date: Wed Dec 13 17:07:21 2017 +0000 arm64: handle 52-bit physical addresses in page table entries The top 4 bits of a 52-bit physical address are positioned at bits 12..15 of a page table entry. Introduce macros to convert between a physical address and its placement in a table entry, and change all macros/functions that access PTEs to use them. With 64k page and non-52-bit kernel, it looks like the bits 12..15 are zero, so we can move the zeros to bits 49..51 because the zeros don't affect the PA, for example: 52-bit non-52-bit (48-bit) PTE 0x0000123456789000 0x0000123456780000 v--------^ v--------^ __pte_to_phys() w/52-bit support PA 0x0009123456780000 0x0000123456780000 I think that this was what the upstream maintainers said.. But "if (lpa_52_bit_support_available)" is also fine for me. >>>> @@ -287,6 +481,15 @@ get_stext_symbol(void) >>>> int >>>> get_machdep_info_arm64(void) >>>> { >>>> + int pa_bits; >>>> + >>>> + /* Determine if the PA address range is 52-bits: ARMv8.2-LPA */ >>>> + pa_bits = get_phy_mask_shift_arm64(); >>>> + DEBUG_MSG("pa_bits : %d\n", pa_bits); >>>> + >>>> + if (pa_bits == 52) >>>> + lpa_52_bit_support_available = 1; >>>> + >> >> This looks a bit redundant, so for example >> >> if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER) { >> info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS); >> if (info->max_physmem_bits == 52) >> lpa_52_bit_support_available = 1; >> } else >> info->max_physmem_bits = 48; > > Ok. > >>>> @@ -425,14 +628,13 @@ vaddr_to_paddr_arm64(unsigned long vaddr) >>>> ERRMSG("Can't get a valid pte.\n"); >>>> return NOT_PADDR; >>>> } else { >>>> - >>>> - paddr = (PAGEBASE(pte_val(ptev)) & PHYS_MASK) >>>> + paddr = (PAGEBASE(pte_val(ptev)) & PTE_ADDR_MASK) >>>> + (vaddr & (PAGESIZE() - 1)); >> >> I think __pte_to_phys() is needed also here, not PTE_ADDR_MASK. > > I had some issues with the same on ARMv8 simulator, so lets stick with the tested 'PTE_ADDR_MASK' usage for now. Did you test this PTE_ADDR_MASK line on a system actually using 52-bit PA? If a 52-bit physical address is actually used, this will return a wrong address, for example: PTE_ADDR_MASK 0x0000fffffffff000 PTE 0x0000123456789000 PAGEBASE'd 0x0000123456780000 v paddr 0x0000123456780000 + 64k offset // incorrect With 52-bit PA, the PTE_ADDR_MASK is used for __phys_to_pte_val(), not __pte_to_phys(). If I understand correctly, we should need __pte_to_phys() also here. paddr = __pte_to_phys(ptev) + (vaddr & (PAGESIZE() - 1)); Could you try this? Thanks, Kazu _______________________________________________ kexec mailing list kexec@xxxxxxxxxxxxxxxxxxx http://lists.infradead.org/mailman/listinfo/kexec