On Tue, 11 Feb 2014 05:05:55 +0200 "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx> wrote: > Okay, it's RFC only. I haven't stabilize it yet. And it's 5 AM... > > It kind of work on small test-cases in kvm, but hung my laptop shortly > after boot. So no benchmark data. > > The patches are on top of mine __do_fault() cleanup. > > The idea is to minimize number of minor page faults by mapping pages around > the fault address if they are already in page cache. > > With the patches we try to map up to 32 pages (subject to change) on read > page fault. Later can extended to write page faults to shared mappings if > works well. > > The pages must be on the same page table so we can change all ptes under > one lock. > > I tried to avoid additional latency, so we don't wait page to get ready, > just skip to the next one. > > The only place where we can get stuck for relatively long time is > do_async_mmap_readahead(): it allocates pages and submits IO. We can't > just skip readahead, otherwise it will stop working and we will get miss > all the time. On other hand keeping do_async_mmap_readahead() there will > probably break readahead heuristics: interleaving access looks as > sequential. > hm, we tried that a couple of times, many years ago. Try https://www.google.com/#q="faultahead" then spend a frustrating hour trying to work out what went wrong. Of course, the implementation might have been poor and perhaps we can get this to work. It would seem to make most sense to tie the faultahead into linear reads of mmapped files. The disk readahead code already tries to recognise and optimise such read patterns, but tying faultahead into readahead won't work well because the pages will often already be in pagecache. A starting point for this work would be to get all the tracepoints in place and then perform some analysis of what the access patterns really look like. Based on that (statistical) analysis we can then design a feature to optimise it and make some predictions about how effective it might be. I have vague memories of writing code which, within the first fault would read the entire file into pagecache and then mapped everything. It was really fast (mainly from linearising the read of executables and libraries) but was wasteful and unserious. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@xxxxxxxxx. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@xxxxxxxxx"> email@xxxxxxxxx </a>