On 2014/01/31 18:58:08, HATAYAMA Daisuke <d.hatayama at jp.fujitsu.com> wrote:
> (2014/01/31 11:36), Atsushi Kumagai wrote:
> > Hello HATAYAMA-san,
> >
> > On 2013/12/09 17:06:18, HATAYAMA Daisuke <d.hatayama at jp.fujitsu.com> wrote:
> >> This is a patch for fixing mmap failure due to fractional page issue.
> >>
> >> This patch might be still a bit too large as a single patch and might need to split.
> >> If you think patch refactoring is needed, please suggest.
> >>
> >> Change Log:
> >>
> >> v1 => v2)
> >>
> >> - Copy fractional pages from 1st kernel to 2nd kernel to reduce read
> >>   to the fractional pages for reliability.
> >>
> >> - Deal with the case where multiple System RAM areas are contained in
> >>   a single fractional page.
> >>
> >> Test:
> >>
> >> Tested on X86_64. Fractional pages are created using memmap= kernel
> >> parameter on the kdump 1st kernel.
> >
> > Could you tell me more details about how to reproduce this ?
> > I tried to create such fractional pages to test the patch at
> > the end of this mail, by using memmap= kernel parameter as you said
> > like below:
> >
> > # cat /proc/iomem
> > ...
> > 100000000-10fff57ff : System RAM
> > 10fff5800-1100057ff : reserved
> > 110005800-11fffffff : System RAM
> >
> > However, I couldn't face the mmap() failure and makedumpfile worked
> > normally even using mmap() on linux-3.12.1. What am I missing here ?
> >
>
> This patch set tries to reduce potential risk on accessing i.e. creating
> page tables reading memory outside System RAM regions. The potential risk
> I intend here is for example effect of accessing mmio region.
>
> If you didn't see any failure except for mmap() failure on fractional pages,
> there's no potential risk on your system in the sense of what I mean above.
> Or you could probably see different behavior by choosing other System RAM
> region that resides in the memory that is used for something special.

Thanks for your response, but I couldn't see even the mmap() failure
caused by a sanity check in remap_pfn_range(). I'll describe what I did
for debugging below.

First, the 1st kernel's memory map I prepared and its PT_LOAD entries
are here:

  # cat /proc/iomem
  ...
  100000000-10fff57ff : System RAM
  10fff5800-1100057ff : ACPI Tables
  110005800-11fffffff : System RAM
  ...

  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x00000000d07ca000 0xffff880100000000 0x0000000100000000
                 0x000000000fff5800 0x000000000fff5800  RWE    0
  LOAD           0x00000000e07c0800 0xffff880110005800 0x0000000110005800
                 0x000000000fffa800 0x000000000fffa800  RWE    0

The fractional page I expected was [0x10fff5000 - 0x10fff57ff], so its
file offset was [0xe07bf000 - 0xe07bf7ff].

Second, I prepared a patch to check whether the fractional page was
mapped with mmap() or not, as below:

diff --git a/makedumpfile.c b/makedumpfile.c
index 7536274..b6abd31 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -251,11 +251,16 @@ update_mmap_range(off_t offset, int initial) {

 	map_size = MIN(max_offset - start_offset, info->mmap_region_size);

+	if (start_offset <= 0xe07bf000 && 0xe07bf000 <= (start_offset + map_size)) {
+		MSG("Try mapping [%llx-%llx] with mmap()\n",
+		    (ulonglong)start_offset,
+		    (ulonglong)(start_offset + map_size));
+	}
+
+
 	info->mmap_buf = mmap(NULL, map_size, PROT_READ, MAP_PRIVATE,
 				     info->fd_memory, start_offset);

Finally, I ran makedumpfile_v1.5.5 with only the debug patch above in
the 2nd kernel (linux-3.12.1):

  # makedumpfile -D -c /proc/vmcore ./dumpfile.c
  ...
  mmap() is available on the kernel.
  Copying data                       : [ 92.9 %] -Try mapping [e03f9000-e07f9000] with mmap()
  Copying data                       : [100.0 %] |
  Writing erase info...
  offset_eraseinfo: 12ad1218, size_eraseinfo: 0

  The dumpfile is saved to ./dumpfile.c.

  makedumpfile Completed.
  #

According to this result, mmap() for the fractional page seemed to
succeed even without any fix, so I suspect that I misunderstand
something about the mmap() issue reported by Vivek.
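
For reference, the offset arithmetic above can be re-checked with a
minimal sketch in C. It only replays the numbers from the first PT_LOAD
entry and from the debug output. The struct and function names are
hypothetical and are not part of makedumpfile, and a 4 KiB page size is
assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as on the x86_64 test machine. */
#define ASSUMED_PAGE_SIZE 0x1000ULL

/* Hypothetical reduced view of one PT_LOAD program header. */
struct pt_load {
	uint64_t file_offset;	/* p_offset: offset within /proc/vmcore */
	uint64_t phys_start;	/* p_paddr:  first physical address     */
	uint64_t file_size;	/* p_filesz: bytes backed by the file   */
};

/* Round an address down to the start of its page. */
uint64_t page_floor(uint64_t addr)
{
	return addr & ~(ASSUMED_PAGE_SIZE - 1);
}

/* True if the segment ends in the middle of a page (fractional page). */
bool has_fractional_tail(const struct pt_load *seg)
{
	uint64_t end = seg->phys_start + seg->file_size;	/* exclusive */

	return (end & (ASSUMED_PAGE_SIZE - 1)) != 0;
}

/*
 * Translate a physical address into a file offset relative to the
 * segment. The caller must pass an address >= seg->phys_start.
 */
uint64_t phys_to_file_offset(const struct pt_load *seg, uint64_t paddr)
{
	return seg->file_offset + (paddr - seg->phys_start);
}

/*
 * True if offset lies in [start, start + size]. Both ends are
 * inclusive, mirroring the condition used in the debug patch.
 */
bool window_covers(uint64_t start, uint64_t size, uint64_t offset)
{
	return start <= offset && offset <= start + size;
}
```

With the first PT_LOAD entry (offset 0xd07ca000, physical address
0x100000000, size 0xfff5800), the last byte is at 0x10fff57ff, its page
starts at 0x10fff5000, and phys_to_file_offset() maps that page start to
0xe07bf000. The window [e03f9000-e07f9000] printed by the debug patch
covers that offset.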

Perhaps a fractional page can pass that sanity check depending on the
situation ?

At least, I confirmed with the debug patch above that the patch I sent
truncates mmap() regions as I expected, so I think there is no problem
with it.


Thanks
Atsushi Kumagai

> Also, in early phase, our design didn't care about this kind of fractional pages
> because we don't think there were many such systems on real world. But the
> bug report came earlier than we expected. So, I think we should design
> carefully here around at least as long as they can be done relatively simply.
> # Sorry for delaying my work...
>
> Of course, I think both kernel and makedumpfile address the issue together
> to reduce potential risk as much as possible.
>
> --
> Thanks.
> HATAYAMA, Daisuke