On Mon, 19 Dec 2016 14:58:26 -0800 Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx> wrote:

> I saw a 4.8->4.9 regression (details below) that I attributed to:
>
>	9dcb8b685f mm: remove per-zone hashtable of bitlock waitqueues
>
> That commit took the bitlock waitqueues from being dynamically allocated
> per-zone to being statically allocated and global. As suggested by
> Linus, this makes them per-node, but keeps them statically allocated.
>
> It leaves us with more waitqueues than the global approach, inherently
> scales them up as we gain nodes, and avoids generating code for
> page_zone(), which was evidently quite ugly. The patch is pretty darn
> tiny, too.
>
> This turns what was a ~40% 4.8->4.9 regression into a 17% gain over
> what 4.8 did. That gain is a _bit_ surprising, but not entirely
> unexpected, since we now get much simpler code with no page_zone() and a
> fixed-size array for which we don't have to follow a pointer (and get to
> do power-of-2 math).

I'll have to respin the PageWaiters patch and resend it. There were just
a couple of small issues picked up in review. I've just been sidetracked
with getting a few other things done and haven't had time to benchmark it
properly.

I'd still like to see what per-node waitqueues do on top of that. If the
gain is significant for realistic workloads, then the same could be done
for the page waitqueues, as Linus said.

Thanks,
Nick
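
P.S. For reference, the shape being discussed is roughly the following. This
is only a sketch of the idea (per-node, statically-allocated waitqueue tables
indexed by a power-of-2 hash of the page pointer); the table size, array name,
and init helper are assumed for illustration and are not taken from the actual
patch.

	#include <linux/cache.h>
	#include <linux/hash.h>
	#include <linux/init.h>
	#include <linux/mm.h>
	#include <linux/nodemask.h>
	#include <linux/wait.h>

	#define WAIT_TABLE_BITS	8
	#define WAIT_TABLE_SIZE	(1 << WAIT_TABLE_BITS)

	/* One fixed-size table of waitqueues per NUMA node, statically allocated. */
	static wait_queue_head_t node_wait_table[MAX_NUMNODES][WAIT_TABLE_SIZE]
		__cacheline_aligned;

	/*
	 * Pick the table for the page's node, then hash the page pointer into
	 * the power-of-2 sized table. No page_zone() call, and no pointer to a
	 * dynamically allocated hash table to chase.
	 */
	static wait_queue_head_t *page_waitqueue(struct page *page)
	{
		return &node_wait_table[page_to_nid(page)]
				       [hash_ptr(page, WAIT_TABLE_BITS)];
	}

	/* Initialise every waitqueue head once at boot. */
	static void __init node_wait_table_init(void)
	{
		int node, i;

		for_each_node(node)
			for (i = 0; i < WAIT_TABLE_SIZE; i++)
				init_waitqueue_head(&node_wait_table[node][i]);
	}

The point is that the number of waitqueues scales with the node count while
the lookup stays a trivial hash into a fixed-size array, which is why the
generated code ends up so much simpler than the old per-zone, dynamically
sized tables.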