On Sat, Jun 2, 2018 at 3:11 AM, Bhupesh Sharma <bhsharma@xxxxxxxxxx> wrote:
> On 05/31/2018 10:21 AM, Bhupesh Sharma wrote:
>>
>> Hi Ard,
>>
>> Sorry I was out for most of the day yesterday. Please see my responses
>> inline.
>>
>> On Mon, May 28, 2018 at 12:16 PM, Ard Biesheuvel
>> <ard.biesheuvel@xxxxxxxxxx> wrote:
>>>
>>> On 27 May 2018 at 23:03, Bhupesh Sharma <bhsharma@xxxxxxxxxx> wrote:
>>>>
>>>> Hi ARM64 maintainers,
>>>>
>>>> I am confused about the PAGE_OFFSET value (or the start of the linear
>>>> map) on a KASLR enabled ARM64 kernel that I am seeing on a board which
>>>> supports a compatible EFI firmware (with EFI_RNG_PROTOCOL support).
>>>>
>>>> 1. 'arch/arm64/include/asm/memory.h' defines PAGE_OFFSET as:
>>>>
>>>> /*
>>>>  * PAGE_OFFSET - the virtual address of the start of the linear map
>>>>  * (top (VA_BITS - 1))
>>>>  */
>>>> #define PAGE_OFFSET (UL(0xffffffffffffffff) - \
>>>>         (UL(1) << (VA_BITS - 1)) + 1)
>>>>
>>>> So for example on a platform with VA_BITS=48, we have:
>>>> PAGE_OFFSET = 0xffff800000000000
>>>>
>>>> 2. However, for the KASLR case, we set 'memstart_offset_seed' to
>>>> the top 16 bits of the 'kaslr-seed' to randomize the linear region in
>>>> 'arch/arm64/kernel/kaslr.c':
>>>>
>>>> u64 __init kaslr_early_init(u64 dt_phys)
>>>> {
>>>>     <snip..>
>>>>     /* use the top 16 bits to randomize the linear region */
>>>>     memstart_offset_seed = seed >> 48;
>>>>     <snip..>
>>>> }
>>>>
>>>> 3. Now, we use the 'memstart_offset_seed' value to randomize the
>>>> 'memstart_addr' value in 'arch/arm64/mm/init.c':
>>>>
>>>> void __init arm64_memblock_init(void)
>>>> {
>>>>     <snip..>
>>>>
>>>>     if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
>>>>         extern u16 memstart_offset_seed;
>>>>         u64 range = linear_region_size -
>>>>             (memblock_end_of_DRAM() - memblock_start_of_DRAM());
>>>>
>>>>         /*
>>>>          * If the size of the linear region exceeds, by a sufficient
>>>>          * margin, the size of the region that the available physical
>>>>          * memory spans, randomize the linear region as well.
>>>>          */
>>>>         if (memstart_offset_seed > 0 && range >= ARM64_MEMSTART_ALIGN) {
>>>>             range = range / ARM64_MEMSTART_ALIGN + 1;
>>>>             memstart_addr -= ARM64_MEMSTART_ALIGN *
>>>>                 ((range * memstart_offset_seed) >> 16);
>>>>         }
>>>>     }
>>>>     <snip..>
>>>> }
>>>>
>>>> 4. Since 'memstart_addr' indicates the start of physical RAM, we
>>>> randomize it on the basis of the 'memstart_offset_seed' value above.
>>>> Also, 'memstart_addr' is listed in '/proc/kallsyms' and hence its
>>>> value can be read by user-space applications.
>>>>
>>>> 5. Now since the PAGE_OFFSET value is also used by several user space
>>>> tools (e.g. the makedumpfile tool uses it to determine the start
>>>> of the linear region and hence to read PT_NOTE fields from /proc/kcore),
>>>> I am not sure how to read the randomized value of the same in the KASLR
>>>> enabled case.
>>>>
>>>> 6. Reading the code further and adding some debug prints, it seems the
>>>> 'memblock_start_of_DRAM()' value is closer to the actual start of the
>>>> linear region than 'memstart_addr' and 'PAGE_OFFSET' in the case of a
>>>> KASLR enabled kernel:
>>>>
>>>> [root@qualcomm-amberwing] # dmesg | grep -i "arm64_memblock_init" -A 5
>>>>
>>>> [    0.000000] inside arm64_memblock_init, memstart_addr = ffff976a00000000,
>>>> linearstart_addr = ffffe89600200000, memblock_start_of_DRAM = ffffe89600200000,
>>>> PHYS_OFFSET = ffff976a00000000, PAGE_OFFSET = ffff800000000000,
>>>> KIMAGE_VADDR = ffff000008000000, kimage_vaddr = ffff20c2d7800000
>>>>
>>>> [root@qualcomm-amberwing] # dmesg | grep -i "Virtual kernel memory layout" -A 15
>>>> [    0.000000] Virtual kernel memory layout:
>>>> [    0.000000]     modules : 0xffff000000000000 - 0xffff000008000000   (   128 MB)
>>>> [    0.000000]     vmalloc : 0xffff000008000000 - 0xffff7bdfffff0000   (126847 GB)
>>>> [    0.000000]       .text : 0xffff20c2d7880000 - 0xffff20c2d8040000   (  7936 KB)
>>>> [    0.000000]     .rodata : 0xffff20c2d8040000 - 0xffff20c2d83a0000   (  3456 KB)
>>>> [    0.000000]       .init : 0xffff20c2d83a0000 - 0xffff20c2d8750000   (  3776 KB)
>>>> [    0.000000]       .data : 0xffff20c2d8750000 - 0xffff20c2d891b200   (  1837 KB)
>>>> [    0.000000]        .bss : 0xffff20c2d891b200 - 0xffff20c2d90a5198   (  7720 KB)
>>>> [    0.000000]     fixed   : 0xffff7fdffe790000 - 0xffff7fdffec00000   (  4544 KB)
>>>> [    0.000000]     PCI I/O : 0xffff7fdffee00000 - 0xffff7fdfffe00000   (    16 MB)
>>>> [    0.000000]     vmemmap : 0xffff7fe000000000 - 0xffff800000000000   (   128 GB maximum)
>>>> [    0.000000]               0xffff7ffa25800800 - 0xffff7ffa2b800000   (    95 MB actual)
>>>> [    0.000000]     memory  : 0xffffe89600200000 - 0xffffe8ae00000000   ( 98302 MB)
>>>>
>>>> As one can see above, the 'memblock_start_of_DRAM()' value of
>>>> 0xffffe89600200000 represents the start of the linear region:
>>>>
>>>> [    0.000000]     memory  : 0xffffe89600200000 - 0xffffe8ae00000000   ( 98302 MB)
>>>>
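For reference, the numbers in the debug print above can be reproduced with a small user-space sketch of the back-computation from 'memstart_addr'. This is only an illustration under stated assumptions: the helper names page_offset() and linear_phys_to_virt() are hypothetical, the translation is assumed to be (phys - PHYS_OFFSET) | PAGE_OFFSET as in kernels of this vintage, and the physical DRAM base of 0x200000 is assumed to be the same as the one seen in '/proc/iomem' on this board.

```c
#include <assert.h>
#include <stdint.h>

/* PAGE_OFFSET as defined in the quoted memory.h snippet. */
static uint64_t page_offset(unsigned int va_bits)
{
    assert(va_bits > 1 && va_bits <= 64);
    return UINT64_MAX - (UINT64_C(1) << (va_bits - 1)) + 1;
}

/*
 * Hypothetical helper: translate a physical address into the linear map.
 * ASSUMPTION: virt = (phys - memstart_addr) | PAGE_OFFSET, evaluated with
 * 64-bit wrap-around. memstart_addr may be "negative" (wrapped) when the
 * linear region is randomized, as in the debug print above.
 */
static uint64_t linear_phys_to_virt(uint64_t phys, uint64_t memstart_addr,
                                    unsigned int va_bits)
{
    return (phys - memstart_addr) | page_offset(va_bits);
}
```

With VA_BITS = 48, memstart_addr = 0xffff976a00000000 and a physical base of 0x200000, linear_phys_to_virt() gives 0xffffe89600200000, which matches the start of the 'memory' line in the layout above.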
>>>> So, my question is: to access the start of the linear region (which
>>>> was earlier determinable via the PAGE_OFFSET macro), should I:
>>>>
>>>> - do some back-computation for the start of the linear region from
>>>> 'memstart_addr' in user-space, or
>>>> - use a new global variable in the kernel which is assigned the value
>>>> of 'memblock_start_of_DRAM()' and expose it via '/proc/kallsyms', so
>>>> that it can be read by user-space tools, or
>>>> - should we rather look at removing the PAGE_OFFSET usage from
>>>> the kernel and replace it with a global variable instead which is
>>>> properly updated for the KASLR case as well.
>>>>
>>>> Kindly share your opinions on what can be a suitable solution in this
>>>> case.
>>>>
>>>> Thanks for your help.
>>>>
>>>
>>> Hello Bhupesh,
>>>
>>> Could you explain what the relevance is of PAGE_OFFSET to userland?
>>> The only thing that should matter is where the actual linear mapping
>>> of DRAM is, and I am not sure I understand why we care about where it
>>> resides relative to the base of the linear region.
>>
>>
>> Actually certain user-space tools like makedumpfile (which is used to
>> generate and compress the vmcore) and crash-utility (which is used to
>> debug the vmcore) rely on the PAGE_OFFSET value (which denotes the
>> base of the linear map region) to determine the virtual to physical
>> mapping of the addresses lying in the linear region.
>>
>> One specific use case that I am working on at the moment is the
>> makedumpfile '--mem-usage' option, which allows one to see the page
>> numbers of the current system (1st kernel) in different use (please see
>> MAKEDUMPFILE(8) for more details).
>>
>> Using this we can know how many pages are dumpable when a different
>> dump_level is specified when invoking makedumpfile.
>>
>> Normally, makedumpfile analyses the contents of '/proc/kcore' (while
>> excluding the crashkernel range), and then calculates the number of
>> pages of each kind per vmcoreinfo.
>>
>> For example, here is an output from my arm64 board (a non-KASLR boot):
>>
>> TYPE            PAGES      EXCLUDABLE   DESCRIPTION
>> ----------------------------------------------------------------------
>> ZERO            49524      yes          Pages filled with zero
>> NON_PRI_CACHE   15143      yes          Cache pages without private flag
>> PRI_CACHE       29147      yes          Cache pages with private flag
>> USER            3684       yes          User process pages
>> FREE            1450569    yes          Free pages
>> KERN_DATA       14243      no           Dumpable kernel data
>>
>> page size: 65536
>> Total pages on system: 1562310
>> Total size on system: 102387548160 Byte
>>
>> This use case requires reading '/proc/kcore' directly, and hence
>> the PAGE_OFFSET value is used to determine the base address of
>> the linear region, whose value is not static in the case of a KASLR boot.
>>
>> Another use case is where the crash-utility uses the PAGE_OFFSET value
>> to perform a virtual-to-physical conversion for an address lying in
>> the linear region:
>>
>> ulong
>> arm64_VTOP(ulong addr)
>> {
>>     if (machdep->flags & NEW_VMEMMAP) {
>>         if (addr >= machdep->machspec->page_offset)
>>             return machdep->machspec->phys_offset
>>                 + (addr - machdep->machspec->page_offset);
>>
>>         <..snip..>
>> }
>>
>
> Another confusing concept is the rounded-down value of 'memstart_addr' in
> 'arch/arm64/mm/init.c' when booting a non-KASLR kernel and when the value
> of memblock_start_of_DRAM() < ARM64_MEMSTART_ALIGN:
>
> void __init arm64_memblock_init(void)
> {
>
>     <..snip..>
>     /*
>      * Select a suitable value for the base of physical memory.
>      */
>     memstart_addr = round_down(memblock_start_of_DRAM(),
>                    ARM64_MEMSTART_ALIGN);
>     <..snip..>
> }
>
> For example, let's consider a case (which I see on my qualcomm board) where
> memblock_start_of_DRAM() = 0x200000 and ARM64_MEMSTART_ALIGN = 0x40000000 (I
> am using VA_BITS = 48 and a 64K page size). In this case
> memstart_addr is calculated as 0, since round_down() returns 0.
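The arithmetic above can be checked with a minimal user-space sketch. It is only an illustration: round_down_pow2() is a hypothetical stand-in for the kernel's round_down() macro (assuming a power-of-two alignment), and select_memstart_addr() is a hypothetical wrapper that mirrors the quoted assignment.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical user-space stand-in for the kernel's round_down().
 * ASSUMPTION: align is a non-zero power of two.
 */
static uint64_t round_down_pow2(uint64_t x, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return x & ~(align - 1);
}

/* Mirrors the memstart_addr selection in the quoted arm64_memblock_init(). */
static uint64_t select_memstart_addr(uint64_t dram_start,
                                     uint64_t memstart_align)
{
    return round_down_pow2(dram_start, memstart_align);
}
```

Calling select_memstart_addr(0x200000, 0x40000000) returns 0, because 0x200000 lies below the first 1 GB boundary. A DRAM base that is already 1 GB aligned, such as 0x80000000, is returned unchanged.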
>
> This is in contrast with the definition of 'memblock_start_of_DRAM':
>
> /* lowest address */
> phys_addr_t __init_memblock memblock_start_of_DRAM(void)
> {
>     return memblock.memory.regions[0].base;
> }
>
> As indicated by the logs below, the first memblock region base starts from
> 0x200000 rather than the 'memstart_addr' value (which is 0):
>
> # dmesg | grep -i "Processing" -A 5
> [    0.000000] efi: Processing EFI memory map:
> [    0.000000] efi: 0x000000200000-0x00000021ffff [Runtime Data |RUN| | | | | | | |WB|WT|WC|UC]
> [    0.000000] efi: 0x000000400000-0x0000005fffff [ACPI Memory NVS | | | | | | | | | | | |UC]
>
> # head -1 /proc/iomem
> 00200000-0021ffff : reserved
>
> Since we define 'PHYS_OFFSET' as the physical address of the start of memory,
> it would be 0 in this case:
>
> /* PHYS_OFFSET - the physical address of the start of memory. */
> #define PHYS_OFFSET ({ VM_BUG_ON(memstart_addr & 1); memstart_addr; })
>
> On the other hand, the first memblock starts from 0x200000, so my question
> is whether we should update the user-space tools, which currently use the
> memblocks listed in '/proc/iomem' to obtain the value of PHYS_OFFSET (by
> reading the base of the 1st memblock), to instead read the value of
> 'memstart_addr' somehow in user-space to get the PHYS_OFFSET, or should the
> change be done at the kernel end to calculate 'memstart_addr' as:
>
>
>     /*
>      * Select a suitable value for the base of physical memory.
>      */
>     memstart_addr = round_down(memblock_start_of_DRAM(),
>                    ARM64_MEMSTART_ALIGN);
>     if (memstart_addr)

Sorry for the typo: I meant if (!memstart_addr) above.

Regards,
Bhupesh

>         memstart_addr = memblock_start_of_DRAM();
>
> Please share your views.
>
> Thanks,
> Bhupesh

_______________________________________________
kexec mailing list
kexec@xxxxxxxxxxxxxxxxxxx
http://lists.infradead.org/mailman/listinfo/kexec