Re: Page allocator bottleneck

Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> · Fri, 3 Nov 2017 13:40:20 +0000

On Thu, Nov 02, 2017 at 07:21:09PM +0200, Tariq Toukan wrote:
> 
> 
> On 18/09/2017 12:16 PM, Tariq Toukan wrote:
> > 
> > 
> > On 15/09/2017 1:23 PM, Mel Gorman wrote:
> > > On Thu, Sep 14, 2017 at 07:49:31PM +0300, Tariq Toukan wrote:
> > > > Insights: Major degradation between #1 and #2, not getting any
> > > > close to linerate! Degradation is fixed between #2 and #3. This is
> > > > because page allocator cannot stand the higher allocation rate. In
> > > > #2, we also see that the addition of rings (cores) reduces BW (!!),
> > > > as result of increasing congestion over shared resources.
> > > > 
> > > 
> > > Unfortunately, no surprises there.
> > > 
> > > > Congestion in this case is very clear. When monitored in perf
> > > > top: 85.58% [kernel] [k] queued_spin_lock_slowpath
> > > > 
> > > 
> > > While it's not proven, the most likely candidate is the zone lock
> > > and that should be confirmed using a call-graph profile. If so, then
> > > the suggestion to tune to the size of the per-cpu allocator would
> > > mitigate the problem.
> > > 
> > Indeed, I tuned the per-cpu allocator and bottleneck is released.
> > 
> 
> Hi all,
> 
> After leaving this task for a while doing other tasks, I got back to it now
> and see that the good behavior I observed earlier was not stable.
> 
> Recall: I work with a modified driver that allocates a page (4K) per packet
> (MTU=1500), in order to simulate the stress on page-allocator in 200Gbps
> NICs.
> 

There is almost new in the data that hasn't been discussed before. The
suggestion to free on a remote per-cpu list would be expensive as it would
require per-cpu lists to have a lock for safe remote access.  However,
I'd be curious if you could test the mm-pagealloc-irqpvec-v1r4 branch
ttps://git.kernel.org/pub/scm/linux/kernel/git/mel/linux.git .  It's an
unfinished prototype I worked on a few weeks ago. I was going to revisit
in about a months time when 4.15-rc1 was out. I'd be interested in seeing
if it has a postive gain in normal page allocations without destroying
the performance of interrupt and softirq allocation contexts. The
interrupt/softirq context testing is crucial as that is something that
hurt us before when trying to improve page allocator performance.

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxx.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>