On Mon, Oct 09, 2017 at 01:07:36PM +0530, Anshuman Khandual wrote:
> On 10/09/2017 11:14 AM, Aaron Lu wrote:
> > __rmqueue() is called by rmqueue_bulk() and rmqueue() under zone->lock
> > and that lock can be heavily contended with memory intensive applications.
> >
> > Since __rmqueue() is a small function, inline it can save us some time.
> > With the will-it-scale/page_fault1/process benchmark, when using nr_cpu
> > processes to stress buddy:
> >
> > On a 2 sockets Intel-Skylake machine:
> >      base     %change     head
> >     77342       +6.3%    82203    will-it-scale.per_process_ops
> >
> > On a 4 sockets Intel-Skylake machine:
> >      base     %change     head
> >     75746       +4.6%    79248    will-it-scale.per_process_ops
> >
> > This patch adds inline to __rmqueue().
> >
> > Signed-off-by: Aaron Lu <aaron.lu@xxxxxxxxx>
>
> Ran it through kernel bench and ebizzy micro benchmarks. Results
> were comparable with and without the patch. May be these are not
> the appropriate tests for this inlining improvement. Anyways it

I think so. The benefit only appears when the lock contention is huge
enough, e.g. perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
is as high as 80% with the workload I have used.

> does not have any performance degradation either.
>
> Reviewed-by: Anshuman Khandual <khandual@xxxxxxxxxxxxxxxxxx>
> Tested-by: Anshuman Khandual <khandual@xxxxxxxxxxxxxxxxxx>

Thanks!