On Tue, Dec 27, 2016 at 11:40 AM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> This patch at least might have a chance in hell of working. Let's see..

Ok, with that fixed, things do indeed seem to work.

And things also look fairly good on my "lots of nasty little
shortlived scripts" benchmark ("make -j32 test" for git, in case
people care).

That benchmark used to have "unlock_page()" and "__wake_up_bit()"
together using about 3% of all CPU time.

Now __wake_up_bit() doesn't show up at all (ok, it's something like
0.02%, so it's technically still there, but..) and "unlock_page()" is
at 0.66% of CPU time. So it's about a quarter of where it used to be.
And now it's about the same cost as the "try_lock_page()" that is
inlined into filemap_map_pages() - it used to be that unlocking the
page was much more expensive than locking it because of all the
unnecessary waitqueue games.

So the benchmark still does a ton of page lock/unlock action, but it
doesn't stand out in the profiles as some kind of WTF thing any more.
And the profiles really show that the cost is the atomic op itself
rather than bad effects from bad code generation, which is what you
want to see.

Would I love to fix this all by not taking the page lock at all? Yes I
would. I suspect we should be able to do something clever and lockless
at least in theory.

But in the meantime, I'm happy with where our page locking overhead
is. And while I haven't seen the NUMA numbers from Dave Hansen with
this all, the early testing from Dave was that the original patch from
Nick already fixed the regression and was the fastest one anyway. And
this optimization will only have improved on things further, although
it might not be as noticeable on NUMA as it is on just a regular
single socket system.

                Linus

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .