On 12/20/13 at 05:49pm, HATAYAMA Daisuke wrote:
> (2013/12/20 11:17), Dave Young wrote:
> >>Also, I'm interested in the memory map passed to from EFI in that
> >>
> >>>cat /proc/iomem:
> >>>00000000-00000fff : reserved
> >>>00001000-0009ffff : System RAM
> >>>000a0000-000bffff : PCI Bus 0000:00
> >>>000f0000-000fffff : System ROM
> >>>00100000-3d162017 : System RAM
> >>>  01000000-015cab9b : Kernel code
> >>>  015cab9c-019beb3f : Kernel data
> >>>  01b4f000-01da9fff : Kernel bss
> >>>  30000000-37ffffff : Crash kernel
> >>>3d162018-3d171e57 : System RAM
> >>>3d171e58-3d172017 : System RAM
> >>>3d172018-3d17ae57 : System RAM
> >>>3d17ae58-3dc10fff : System RAM
> >>
> >>this part is consecutive but somehow is divided into 4 entries.
> >>You called your environment as ``EFI virtual machine'', could you tell
> >>me precisely what it mean? qemu/KVM or VMware guest system? I do want
> >>to understand how this kind of memory map was created. I think this
> >>kind of memory mapping is odd and I guess this is caused by the fact
> >>that the system is a virtual environment.
> >
> >This is not specific to EFI machine, it's the reserved setup_data regions
> >They happened to be continuous but they do not have to be continuous.
> >
>
> Thanks for pointing out that. I've just read Documentation/x86/boot.txt
> and parse_setup_data().
>
> But I don't understand well why these regions are divided as these.
> I guess kernel divides the System RAM this way and the memory map first
> passed to by EFI is all page aligned, right?

setup_data is passed as a linked list by the boot loader. Each node is a
block of memory, and there can be many different setup_data types.

>
> Also, looking at parse_setup_data(), currently handled data in setup_data
> interface is extended e820 entries and dtb case only.
>
>         switch (data_type) {
>         case SETUP_E820_EXT:
>                 parse_e820_ext(pa_data, data_len);
>                 break;
>         case SETUP_DTB:
>                 add_dtb(pa_data);
>                 break;
>         default:
>                 break;
>         }
>
> Is it right that this kind of memory map doesn't occur as long as either
> of information is passed to via setup_data? IOW, is this necessary
> information?

If the bootloader does not pass it, there will be no such memory ranges in
/proc/iomem.

>
> >>
> >>And for Vivek, this case is a concrete example of multiple RAM entries
> >>appearing in a single page I suspected in the mmap failure patch,
> >>although these entries are consecutive in physical address and can be
> >>represented by a single entry by merging them in a single entry. But
> >>then it seems to me that there could be more odd case that multiple
> >>RAM entries but not consecutive. I again think this should be addressed
> >>in the patch for the mmap failure issue. How do you think?
> >
> >They are different problems, the previous mmap bug is for cross page regions
> >with different page flags.
> >
>
> I understand that. What I think problem here is the case where multiple
> System RAM entries appear in a single page. In the above memory map, they
> are 3d171000, 3d172000 and 3d17a000. My fixing patch is to copy fractional
> pages in the 2nd kernel in order to make it possible to mmap without affecting
> non-System RAM area as much as possible, and then if there is this kind of
> System RAM entries, we need to use the same page in the 2nd kernel for
> different System RAM entries that shares the same page in the 1st kernel. This
> needs a little additional processing and we want to keep implementation as
> simple as possible as long as there's no such system in real world. However,
> I'm surprised to see the memory mapping above.

These ranges are "System RAM" of type E820_RESERVED_KERN, please see:
arch/x86/kernel/setup.c: e820_reserve_setup_data()

Thanks
Dave
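
For reference, the case discussed above (several System RAM entries falling in one page) can be checked mechanically. What follows is only a userspace sketch, not kernel code: the struct and function names are hypothetical, the page size is assumed to be 4 KiB, and the table is copied by hand from the top-level System RAM lines of the /proc/iomem listing quoted in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical helper type: one top-level "System RAM" line of /proc/iomem. */
struct ram_range {
    uint64_t start; /* first byte, inclusive */
    uint64_t end;   /* last byte, inclusive, as /proc/iomem prints it */
};

/* Copied by hand from the listing quoted in this thread. */
static const struct ram_range system_ram[] = {
    { 0x00001000, 0x0009ffff },
    { 0x00100000, 0x3d162017 },
    { 0x3d162018, 0x3d171e57 },
    { 0x3d171e58, 0x3d172017 },
    { 0x3d172018, 0x3d17ae57 },
    { 0x3d17ae58, 0x3dc10fff },
};
#define N_SYSTEM_RAM (sizeof(system_ram) / sizeof(system_ram[0]))

/*
 * Walk a list sorted by start address and record the base address of each
 * page that contains both the last byte of one entry and the first byte of
 * the next. At most max addresses are stored in out; the return value is
 * the total number of such boundaries. A page shared by three or more
 * entries is reported once per adjacent pair.
 */
size_t find_shared_pages(const struct ram_range *r, size_t n,
                         uint64_t *out, size_t max)
{
    size_t found = 0;

    for (size_t i = 0; i + 1 < n; i++) {
        /* Precondition: entries are well formed, sorted, non-overlapping. */
        assert(r[i].start <= r[i].end && r[i].end < r[i + 1].start);

        uint64_t last_pfn = r[i].end >> PAGE_SHIFT;
        uint64_t next_pfn = r[i + 1].start >> PAGE_SHIFT;

        if (last_pfn == next_pfn) {
            if (found < max)
                out[found] = last_pfn << PAGE_SHIFT;
            found++;
        }
    }
    return found;
}
```

Calling find_shared_pages(system_ram, N_SYSTEM_RAM, pages, 8) on this table reports four pages: 3d171000, 3d172000 and 3d17a000 as named above, plus 3d162000, where the large 00100000-3d162017 entry ends and the first small entry begins.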