From: Atsushi Kumagai <kumagai-atsushi@xxxxxxxxxxxxxxxxx>
Subject: Re: [PATCH 00/13] kdump, vmcore: support mmap() on /proc/vmcore
Date: Fri, 15 Feb 2013 12:57:01 +0900

> On Thu, 14 Feb 2013 19:11:43 +0900
> HATAYAMA Daisuke <d.hatayama at jp.fujitsu.com> wrote:
<cut>
>> TODO
>> ====
>>
>> - fix makedumpfile to use mmap() on /proc/vmcore and benchmark it to
>>   confirm whether we can see enough performance improvement.
>
> As a first step, I'll make a prototype patch for benchmarking unless you
> have already done it.
>

I have an idea, but I haven't started developing it yet.

I think there are two points we should optimize. One is write_kdump_pages(),
which reads the target page frames, compresses them if necessary, and writes
each page frame's data in order; the other is __exclude_unnecessary_pages(),
which reads the mem_map array into page_cache and processes it for filtering.

Optimizing the former with mmap() seems trivial, but the latter needs more
thought: mem_map lives in the virtual memory map region, so it is virtually
contiguous but not guaranteed to be physically contiguous. Hence, the current
implementation reads the mem_map array one 4KB page at a time, doing a
virtual-to-physical translation for each page. This is critical for
performance and not suitable for optimization by mmap(). We should fix this
anyway.

My idea here is to exploit the fact that the virtual memory map region is
actually mapped with PMD-level page entries, i.e. 4MB pages, if the processor
in use supports large pages. Then the physical range obtained from each page
translation is guaranteed to be physically contiguous for at least 4MB.
Looking at the benchmark, the performance improvement is already saturated in
the 4MB case, so I guess we can get enough improvement by mmap()ing the
mem_map array in these 4MB units. Rough sketches of both changes are appended
below.

Thanks.
HATAYAMA, Daisuke
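
A minimal sketch of the read-side change for write_kdump_pages(), under the
assumption that /proc/vmcore supports mmap() as in this patch set.
paddr_to_offset() is a hypothetical stand-in for the PT_LOAD lookup that
makedumpfile already performs; here it simply assumes offset == paddr, which
is not true in general.

/*
 * Minimal sketch, not makedumpfile code: read page frame data through
 * an mmap() window on /proc/vmcore instead of repeated read() calls.
 */
#include <stdio.h>
#include <stdint.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>

#define MAP_CHUNK	(4UL << 20)	/* 4MB window, matching the benchmark */

/*
 * Hypothetical stand-in: the real code must translate a physical address
 * to a file offset via the PT_LOAD headers of /proc/vmcore.  Identity
 * mapping is assumed here only to keep the sketch self-contained.
 */
static off_t paddr_to_offset(uint64_t paddr)
{
	return (off_t)paddr;
}

/* Map a 4MB window containing paddr; returns the mapping base for munmap(). */
static char *map_window(int fd, uint64_t paddr, size_t *maplen, char **data)
{
	long pagesize = sysconf(_SC_PAGESIZE);
	off_t offset  = paddr_to_offset(paddr);
	off_t aligned = offset & ~((off_t)pagesize - 1);
	size_t len    = MAP_CHUNK + (size_t)(offset - aligned);
	char *base    = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, aligned);

	if (base == MAP_FAILED)
		return NULL;
	*maplen = len;
	*data = base + (offset - aligned);	/* first byte of paddr */
	return base;
}

int main(void)
{
	int fd = open("/proc/vmcore", O_RDONLY);
	size_t maplen;
	char *data, *base;

	if (fd < 0) {
		perror("open /proc/vmcore");
		return 1;
	}
	base = map_window(fd, 0, &maplen, &data);
	if (base) {
		/* compress/copy page frames directly from 'data' here */
		munmap(base, maplen);
	}
	close(fd);
	return 0;
}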
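
And a function-level sketch of the filtering side: translate the vmemmap
address once per 4MB block and mmap() the physically contiguous block from
/proc/vmcore, instead of reading 4KB at a time with a translation per page.
vaddr_to_paddr(), paddr_to_offset(), size_of_struct_page and
filter_struct_page() are hypothetical stand-ins for what makedumpfile already
has; their trivial bodies only keep the sketch compilable. The sketch also
assumes struct page entries never straddle a 4MB boundary.

/*
 * Sketch only: walk mem_map in PMD-sized (4MB) blocks, doing one
 * virtual-to-physical translation per block and mmap()ing the whole
 * physically contiguous block from /proc/vmcore.
 */
#include <stdint.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/mman.h>

#define PMD_MAP_SIZE	(4ULL << 20)	/* mapping granularity assumed above */

/*
 * Hypothetical stand-ins for makedumpfile's existing infrastructure.
 * The real code walks the dumped page tables and the PT_LOAD headers.
 */
static uint64_t vaddr_to_paddr(uint64_t vaddr) { return vaddr; }
static off_t paddr_to_offset(uint64_t paddr) { return (off_t)paddr; }
static size_t size_of_struct_page = 64;		/* SIZE(page) from vmcoreinfo */
static void filter_struct_page(const void *page) { (void)page; }

/*
 * Feed the struct page entries for pfns [start_pfn, end_pfn) to the
 * filtering code, one 4MB mapping at a time.
 */
int scan_mem_map(int fd, uint64_t mem_map_vaddr,
		 uint64_t start_pfn, uint64_t end_pfn)
{
	uint64_t vaddr = mem_map_vaddr + start_pfn * size_of_struct_page;
	uint64_t vend  = mem_map_vaddr + end_pfn * size_of_struct_page;

	while (vaddr < vend) {
		/* one translation per 4MB block instead of one per 4KB page */
		uint64_t block = vaddr & ~(PMD_MAP_SIZE - 1);
		uint64_t paddr = vaddr_to_paddr(block);
		uint64_t limit = block + PMD_MAP_SIZE;
		char *map = mmap(NULL, PMD_MAP_SIZE, PROT_READ, MAP_PRIVATE,
				 fd, paddr_to_offset(paddr));

		if (map == MAP_FAILED)
			return -1;
		if (limit > vend)
			limit = vend;
		for (; vaddr < limit; vaddr += size_of_struct_page)
			filter_struct_page(map + (vaddr - block));

		munmap(map, PMD_MAP_SIZE);
	}
	return 0;
}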