Re: isolate_freepages_block and excessive CPU usage by OSD process

On Tue, Dec 02, 2014 at 12:47:24PM +1100, Christian Marie wrote:
> On 28.11.2014 9:03, Joonsoo Kim wrote:
> > Hello,
> >
> > I haven't followed this discussion closely, but at a glance, this
> > excessive CPU usage by compaction looks related to the following fixes.
> >
> > Could you test the following two patches?
> >
> > If these fix your problem, I will resubmit the patches with a proper
> > commit description.
> >
> > -------- 8< ---------
> 
> 
> Thanks for looking into this. Running 3.18-rc5 kernel with your patches has
> produced some interesting results.
> 
> Load average still spikes to around 2000-3000 with the processors spinning 100%
> doing compaction related things when min_free_kbytes is left at the default.
> 
> However, unlike before, the system is now completely stable. Pre-patch it would
> be almost completely unresponsive (having to wait 30 seconds to establish an
> SSH connection and several seconds to send a character).
> 
> Is it reasonable to guess that ipoib is giving compaction a hard time and
> fixing this bug has allowed the system to at least not lock up?
> 
> I will try back-porting this to 3.10 and seeing if it is stable under these
> strange conditions also.

Hello,

Good to hear!
The load average spike may be related to skip-bit management. Currently, there
is no way to keep the skip bit set permanently. So, once an iteration of
compaction finishes and the skip bits are reset, every pageblock has to be
re-scanned.

Your system uses the Mellanox driver. I don't know exactly how it works, but
I have heard that it allocates a very large number of pages and calls
get_user_pages() to pin them in memory. That memory isn't available to
compaction, yet compaction always scans it.
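
To illustrate the suspected effect, here is a toy model in Python. It is not
kernel code: PageBlock, compaction_pass() and simulate() are hypothetical
names, and the model assumes a pinned pageblock never yields anything to
compaction. It only shows that the more often the skip bits are reset, the
more often the same pinned pageblocks get scanned for nothing.

```python
from dataclasses import dataclass


@dataclass
class PageBlock:
    pinned: bool         # assumption: pinned blocks never yield pages
    skip: bool = False   # cached hint: "nothing to isolate here"


def compaction_pass(blocks):
    """Scan every block whose skip bit is clear.

    Returns (scanned, wasted), where wasted counts scans of pinned blocks.
    """
    scanned = wasted = 0
    for block in blocks:
        if block.skip:
            continue
        scanned += 1
        if block.pinned:
            wasted += 1
            block.skip = True   # remembered only until the next reset
    return scanned, wasted


def reset_skip_bits(blocks):
    """Forget all cached hints, as happens after a compaction iteration."""
    for block in blocks:
        block.skip = False


def simulate(n_pinned, n_free, passes_per_reset, total_passes):
    """Return the total number of wasted scans over total_passes passes."""
    blocks = [PageBlock(pinned=True) for _ in range(n_pinned)]
    blocks += [PageBlock(pinned=False) for _ in range(n_free)]
    total_wasted = 0
    for i in range(total_passes):
        if i > 0 and i % passes_per_reset == 0:
            reset_skip_bits(blocks)
        _, wasted = compaction_pass(blocks)
        total_wasted += wasted
    return total_wasted
```

In this model the wasted work grows with both the number of pinned pageblocks
and the reset frequency, which would match the behaviour described above if
the assumption about the pinned memory holds.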

This is only my assumption, so if possible, please check it with the
compaction tracepoints. If it turns out to be the case, we can work out a
solution for this problem.
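
One rough way to check this is to enable the compaction event group through
tracefs (usually events/compaction/enable under /sys/kernel/debug/tracing)
and summarize the captured trace. The Python sketch below assumes the
isolate events print nr_scanned= and nr_taken= fields; the exact line format
may differ between kernel versions, and summarize() and summarize_file() are
hypothetical helper names.

```python
import re
from collections import defaultdict

# Assumption: event names start with "mm_compaction_" and are followed by ":".
EVENT_RE = re.compile(r"(mm_compaction_\w+):")
# Assumption: counters appear as "nr_scanned=N" and "nr_taken=N".
FIELD_RE = re.compile(r"(nr_scanned|nr_taken)=(\d+)")


def summarize(lines):
    """Sum nr_scanned and nr_taken per compaction event name."""
    totals = defaultdict(
        lambda: {"events": 0, "nr_scanned": 0, "nr_taken": 0}
    )
    for line in lines:
        match = EVENT_RE.search(line)
        if match is None:
            continue            # not a compaction event line
        entry = totals[match.group(1)]
        entry["events"] += 1
        for key, value in FIELD_RE.findall(line):
            entry[key] += int(value)
    return dict(totals)


def summarize_file(path):
    """Summarize a saved trace, e.g. a copy of the tracefs 'trace' file."""
    with open(path, "r", errors="replace") as trace:
        return summarize(trace)
```

If nr_scanned is very large while nr_taken stays close to zero for the
freepages or migratepages isolation events, that would suggest compaction is
repeatedly scanning memory it cannot use.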

Anyway, could you test once more without the second patch?
IMO, the first patch is reasonable to backport because it fixes a real bug,
but I'm not sure whether the second patch needs to be backported.
One more test will help us understand the effect of each patch.

Thanks.
