On Wed, 4 Mar 2015 13:44:18 +0100
Michael Holzheu <holzheu at linux.vnet.ibm.com> wrote:

> On Tue, 3 Mar 2015 11:07:50 +0100
> Petr Tesarik <ptesarik at suse.cz> wrote:
>
> > On Tue, 3 Mar 2015 10:15:43 +0100
> > Michael Holzheu <holzheu at linux.vnet.ibm.com> wrote:
> > [snip]
> > >
> > > I did a quick test with your patch and it looks like the mmap mode
> > > on my s390 system is slower than the read mode:
> >
> > That's sad. OTOH I had similar results on a file mmap some time ago.
> > The cost of copying data was less than the cost of handling a series of
> > minor page faults.
>
> I think we understood the problem: As for the read path, also for mmap
> the memory is copied into a temporary buffer:
>
> static int read_with_mmap(off_t offset, void *bufptr, ...)
> {
>
> ...
>         memcpy(bufptr, info->mmap_buf +
>                (offset - info->mmap_start_offset), read_size);
>
>
> Because on s390 copy_to_user() is as fast as userspace memcpy() we
> don't have any benefit here. The only saving is due to less
> mmap()/munmap() than read() system calls because bigger chunks
> are mapped than read.
>
> If you specify -d 31 the dump memory is fragmented and we have to
> issue more mmap()/munmap() calls and therefore also the system
> call overhead increases.
>
> If we really want to speed up the mmap path on s390 we probably
> have to get rid of the temporary buffer.
>
> What do you think?

I'm not sure. Clearly, we should get rid of the temporary buffer. OTOH
this slow-down should be observed on all architectures, not just s390.

Now, mmap should have been implemented in the cache code, not above
it. Since I wrote the cache, this task is probably up to me.

Stay tuned,
Petr T
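
For illustration only, here is a minimal sketch of the difference between the copying path quoted above and a path without the temporary buffer. It is not taken from the patch under discussion: struct map_window, window_contains(), window_copy() and window_peek() are hypothetical names, and the sketch assumes the window's buf pointer was obtained from mmap() elsewhere. The copying variant mirrors the memcpy() in read_with_mmap(); the other variant simply hands back a pointer into the mapping, so the caller pays only for the page faults.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical stand-in for the mapping state (mmap_buf, mmap_start_offset). */
struct map_window {
	const char *buf;     /* start of the mapped region (assumed to come from mmap()) */
	off_t start_offset;  /* file offset that buf[0] corresponds to */
	size_t len;          /* number of mapped bytes */
};

/* Return 1 if [offset, offset + size) lies entirely inside the window. */
static int window_contains(const struct map_window *w, off_t offset, size_t size)
{
	size_t rel;

	assert(w != NULL);
	if (offset < w->start_offset)
		return 0;
	rel = (size_t)(offset - w->start_offset);
	return rel <= w->len && size <= w->len - rel;
}

/* Copying variant: same shape as the memcpy() in the quoted code. */
static int window_copy(const struct map_window *w, off_t offset,
		       void *bufptr, size_t size)
{
	if (!window_contains(w, offset, size))
		return 0;
	memcpy(bufptr, w->buf + (offset - w->start_offset), size);
	return 1;
}

/* Zero-copy variant: return a pointer into the mapping, or NULL if the
 * requested range is not covered by the current window. */
static const void *window_peek(const struct map_window *w, off_t offset,
			       size_t size)
{
	if (!window_contains(w, offset, size))
		return NULL;
	return w->buf + (offset - w->start_offset);
}
```

The catch with the pointer-returning variant is lifetime: the pointer is only valid until the window is unmapped or moved, which is one reason this kind of logic arguably belongs inside the cache code, where the mapping's lifetime can be tracked.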