On Tue, Feb 18, 2014 at 10:28:11AM -0800, Linus Torvalds wrote:
> On Tue, Feb 18, 2014 at 10:07 AM, Kirill A. Shutemov
> <kirill.shutemov@xxxxxxxxxxxxxxx> wrote:
> >
> > Patch is wrong. Correct one is below.
>
> Hmm. I don't hate this. Looking through it, it's fairly simple
> conceptually, and the code isn't that complex either. I can live with
> this.
>
> I think it's a bit odd how you pass both "max_pgoff" and "nr_pages" to
> the fault-around function, though. In fact, I'd consider that a bug.
> Passing in "FAULT_AROUND_PAGES" is just wrong, since the code cannot -
> and in fact *must* not - actually fault in that many pages, since the
> starting/ending address can be limited by other things.
>
> So I think that part of the code is bogus. You need to remove
> nr_pages, because any use of it is just incorrect. I don't think it
> can actually matter, since the max_pgoff checks are more restrictive,
> but if you think it can matter please explain how and why it wouldn't
> be a major bug?

I don't like this either...

The current max_pgoff is the end of the page table (or the end of the vma,
if it ends before that). If we drop nr_pages but keep the current
max_pgoff, we will potentially set up PTRS_PER_PTE pages at a time: i.e.
page fault on the first page of a page table and all pages are ready.
nr_pages limits the number.

It's not necessarily a bad idea to populate the whole page table at once.
I need to measure how much latency we would add by doing that.
The only problem I see is that we take ptl for a bit too long. But with
split ptl it will affect only the page table we populate.

The other approach is to limit ourselves to FAULT_AROUND_PAGES from
start_addr. In this case we will sometimes do a useless radix-tree lookup
even if we had a chance to populate pages further in the page table.
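
To make the two bounds concrete, here is a minimal userspace sketch of the
arithmetic, not the code from the patch. The helper names
(pages_to_table_end, pages_capped) are hypothetical, and the PAGE_SHIFT and
PTRS_PER_PTE values are assumed x86-64-style constants; only
FAULT_AROUND_ORDER = 5 comes from this thread.

```c
#include <assert.h>

/* Assumed x86-64-style constants, for illustration only. */
#define PAGE_SHIFT          12
#define PAGE_SIZE           (1UL << PAGE_SHIFT)
#define PTRS_PER_PTE        512UL
#define PMD_SIZE            (PTRS_PER_PTE * PAGE_SIZE)  /* span of one page table */
#define FAULT_AROUND_ORDER  5                           /* value discussed in the thread */
#define FAULT_AROUND_PAGES  (1UL << FAULT_AROUND_ORDER)

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/*
 * Hypothetical helper: number of pages from start_addr up to the end of
 * the page table or the end of the vma, whichever comes first. This
 * models the "max_pgoff only" bound.
 */
static unsigned long pages_to_table_end(unsigned long start_addr,
					unsigned long vma_end)
{
	unsigned long table_end;

	assert(start_addr < vma_end);
	table_end = (start_addr & ~(PMD_SIZE - 1)) + PMD_SIZE;
	return (min_ul(table_end, vma_end) - start_addr) >> PAGE_SHIFT;
}

/*
 * Hypothetical helper: the same bound, additionally capped at
 * FAULT_AROUND_PAGES. This models what nr_pages adds.
 */
static unsigned long pages_capped(unsigned long start_addr,
				  unsigned long vma_end)
{
	return min_ul(pages_to_table_end(start_addr, vma_end),
		      FAULT_AROUND_PAGES);
}
```

Under these assumptions, a fault on the first page of a page table inside a
large vma gives 512 pages with the max_pgoff-only bound and 32 with the cap.
A fault three pages before the end of the table gives 3 either way, which is
the case where the max_pgoff check is the more restrictive one.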

> Apart from that, I'd really like to see numbers for different ranges
> of FAULT_AROUND_ORDER, because I think 5 is pretty high, but on the
> whole I don't find this horrible, and you still lock the page so it
> doesn't involve any new rules. I'm not hugely happy with another raw
> radix-tree user, but it's not horrible.
>
> Btw, is the "radix_tree_deref_retry(page) -> goto restart" really
> necessary? I'd be almost more inclined to just make it just do a
> "break;" to break out of the loop and stop doing anything clever at
> all.

The code is not ready yet. I'll rework it. It's just what I had by the end
of the day. I wanted to know whether setting up ptes directly from
->fault_nonblock() is an okayish approach or is considered a layering
violation.

-- 
 Kirill A. Shutemov