On Mon, 3 Jun 2013 11:59:40 -0400
Vivek Goyal <vgoyal at redhat.com> wrote:

> On Mon, Jun 03, 2013 at 03:27:18PM +0200, Michael Holzheu wrote:
>
> [..]
> > > If not, how would remap_pfn_range() work with HSA region when
> > > /proc/vmcore is mmaped()?
> >
> > I am no memory management expert, so I discussed that with Martin
> > Schwidefsky (s390 architecture maintainer). Perhaps something like
> > the following could work:
> >
> > After vmcore_mmap() is called the HSA pages are not initially
> > mapped in the page tables. So when user space accesses those parts
> > of /proc/vmcore, a fault will be generated. We implement a mechanism
> > that in this case the HSA is copied to a new page in the page cache
> > and a mapping is created for it. Since the page is allocated in the
> > page cache, it can be released afterwards by the kernel when we get
> > memory pressure.
> >
> > Our current idea for such an implementation:
> >
> > * Create new address space (struct address_space) for /proc/vmcore.
> > * Implement new vm_operations_struct "vmcore_mmap_ops" with
> >   new vmcore_fault() ".fault" callback for /proc/vmcore.
> > * Set vma->vm_ops to vmcore_mmap_ops in mmap_vmcore().
> > * The vmcore_fault() function will get a new page cache page,
> >   copy HSA page to page cache page and add it to vmcore address
> >   space. To see how this could work, we looked into the functions
> >   filemap_fault() in "mm/filemap.c" and relay_buf_fault() in
> >   "kernel/relay.c".
> >
> > What do you think?
>
> I am not mm expert either but above proposal sounds reasonable to me.
>
> So remap_pfn_range() call will go in arch dependent code so that arch
> can decide which range can be mapped right away and which ranges will
> be filled in when fault happens? I am assuming that s390 will map
> everything except for pfn between 0 and HSA_SIZE.

Yes, for [0 - HSA_SIZE] the fault handler will be called and for the
rest we establish a mapping with remap_pfn_range() as it is currently
done.
Therefore no fault handler will be called for that part of /proc/vmcore.

I will try to find out if it is doable that way.

> And regular s390 kdump will map everything right away and will not
> have to rely on fault mechanism?

Yes, just like kdump on the other archs.

Thanks
Michael
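
For illustration only, the range split discussed in this thread can be
modelled in plain user-space C. This is a rough sketch, not kernel code
and not the actual patch: the names plan_vmcore_mapping(), struct
map_plan and HSA_SIZE_SKETCH are hypothetical, the 32 MiB value is an
assumed placeholder, and the HSA range is treated as the half-open
interval [0, hsa_size). The part below hsa_size would be left to the
fault handler; the rest would be mapped up front, as remap_pfn_range()
does today.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: placeholder value for illustration, not taken from the s390 code. */
#define HSA_SIZE_SKETCH ((uint64_t)32 * 1024 * 1024)

/* Hypothetical description of how one requested range would be handled. */
struct map_plan {
    uint64_t fault_start, fault_len; /* left unmapped, filled in on fault */
    uint64_t eager_start, eager_len; /* mapped right away at mmap() time  */
};

/*
 * Split [start, start + len) at the HSA boundary.
 * hsa_size == 0 models regular kdump, where everything is mapped eagerly.
 */
static struct map_plan plan_vmcore_mapping(uint64_t start, uint64_t len,
                                           uint64_t hsa_size)
{
    struct map_plan p = {0, 0, 0, 0};
    uint64_t end = start + len;

    assert(end >= start); /* the range must not wrap around */

    if (start < hsa_size) {
        /* Portion inside the HSA: handled by the fault path. */
        uint64_t fault_end = end < hsa_size ? end : hsa_size;
        p.fault_start = start;
        p.fault_len = fault_end - start;
    }
    if (end > hsa_size) {
        /* Portion above the HSA: mapped up front. */
        uint64_t eager_start = start > hsa_size ? start : hsa_size;
        p.eager_start = eager_start;
        p.eager_len = end - eager_start;
    }
    return p;
}
```

With hsa_size set to 0 the fault part is empty and the whole range is
mapped eagerly, which corresponds to the regular kdump case above.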