On Tue, Dec 17, 2013 at 10:20:07AM -0600, Alex Thorlton wrote:
> This message in particular:
>
> https://lkml.org/lkml/2013/8/2/697

I think adding a prctl (or similar), inherited by children, to turn off
THP would be a fine addition to the current madvise. You could then run
any static app under a wrapper like "THP_disable ./whatever".

The idea is: if the software is maintained, madvise allows fine-grained
optimization; if the software is legacy, proprietary, statically linked
(or if it already uses LD_PRELOAD for other things), prctl takes care of
it in a coarser way (but still per-app).

> The thread I mention above originally proposed a per-process switch to
> disable THP without the use of madvise, but it was not very well
> received. I'm more than willing to revisit that idea, and possibly

I think you provided enough explanation of why it is needed (static
binaries, proprietary apps, the annoyance of an LD_PRELOAD that may
collide with other LD_PRELOADs in proprietary apps, and so on), so I
think a prctl is a reasonable addition to madvise.

We also have an madvise to turn on THP selectively on embedded systems,
which may boot with enabled=madvise to be sure not to waste any memory
because of THP. But a prctl to selectively enable THP doesn't make much
sense, since one has to enable it in a fine-grained way to be sure not
to cause any memory waste. So I think a NOHUGEPAGE prctl would be
enough.

> meld the two (a per-process threshold, instead of a big-hammer on-off
> switch). Let me know if that seems preferable to this idea and we can
> discuss.

The per-process threshold would be a much bigger patch. I think starting
with the big-hammer on-off switch is preferable: it is much simpler, and
it should be more than enough to take care of the rare corner cases
while leaving the other workloads unaffected (modulo the cacheline
needed to check the task or mm flags) and running at max speed.

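For illustration only, here is a rough sketch of what the core of a "THP_disable ./whatever" wrapper could look like. No such prctl existed when this mail was written; the sketch assumes the PR_SET_THP_DISABLE / PR_GET_THP_DISABLE interface that later kernels (3.15 and newer) provide, and the helper names thp_disable_self() and exec_without_thp() are hypothetical. It relies on the flag being inherited across fork and preserved across execve, which is the behaviour asked for above.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/prctl.h>
#include <unistd.h>

/* Option numbers from later kernel headers; defined here as a fallback
 * in case the installed headers predate them (assumption of this sketch). */
#ifndef PR_SET_THP_DISABLE
#define PR_SET_THP_DISABLE 41
#endif
#ifndef PR_GET_THP_DISABLE
#define PR_GET_THP_DISABLE 42
#endif

/* Hypothetical helper: set the per-process "THP disabled" flag on the
 * calling process. Returns 0 on success, -1 with errno set on failure
 * (for example on a kernel that does not know this prctl). */
int thp_disable_self(void)
{
    return prctl(PR_SET_THP_DISABLE, 1, 0, 0, 0) == 0 ? 0 : -1;
}

/* Hypothetical helper: disable THP, then replace this process with the
 * target program. It only returns (with -1) if something failed. */
int exec_without_thp(char *const argv[])
{
    if (thp_disable_self() != 0) {
        fprintf(stderr, "prctl(PR_SET_THP_DISABLE): %s\n", strerror(errno));
        return -1;
    }
    execvp(argv[0], argv);
    fprintf(stderr, "execvp(%s): %s\n", argv[0], strerror(errno));
    return -1;
}
```

A wrapper's main() would then just call exec_without_thp(argv + 1) after checking that argc is at least 2. The static or proprietary binary needs no madvise calls and no LD_PRELOAD; the only per-fault cost is checking one task or mm flag.
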
To evaluate the threshold solution, we would first need a variety of
benchmarks across many apps, to see the effect it has on the non-corner
cases. The big-hammer on-off prctl is instead a black-and-white design
that won't require any black-magic settings.

Ideally, if we add a threshold later, it won't require any more
cacheline accesses, as the threshold would also need to be per-task or
per-mm. The runtime cost of the prctl would then be zero, and the prctl
could become a benchmarking tweak even if we add the per-app threshold
later.

About creating heuristics to automatically detect the ideal value of
the big-hammer per-app on/off switch (or, even harder, the ideal value
of the per-app threshold): I think it's not going to happen, because
there are too few corner cases and it wouldn't be worth the cost (the
cost would be significant no matter how it is implemented). Every time
we try to make THP smarter at auto-disabling itself for the corner
cases, we slow it down for everyone who benefits from it, and there's
no way around that. This is why I think the big-hammer prctl for the
few corner cases is the best way to go.

Thanks!
Andrea