Re: single aio thread is migrated crazily by scheduler

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Nov 21, 2019 at 02:29:37PM +0100 Peter Zijlstra wrote:
> On Wed, Nov 20, 2019 at 05:03:13PM -0500, Phil Auld wrote:
> > On Wed, Nov 20, 2019 at 08:16:36PM +0100 Peter Zijlstra wrote:
> > > On Tue, Nov 19, 2019 at 07:40:54AM +1100, Dave Chinner wrote:
> 
> > > > Yes, that's precisely the problem - work is queued, by default, on a
> > > > specific CPU and it will wait for a kworker that is pinned to that
> > > 
> > > I'm thinking the problem is that it doesn't wait. If it went and waited
> > > for it, active balance wouldn't be needed, that only works on active
> > > tasks.
> > 
> > Since this is AIO I wonder if it should queue_work on a nearby cpu by 
> > default instead of unbound.  
> 
> The thing seems to be that 'unbound' is in fact 'bound'. Maybe we should
> fix that. If the load-balancer were allowed to move the kworker around
> when it didn't get time to run, that would probably be a better
> solution.
> 

Yeah, I'm not convinced this is actually a scheduler issue.


> Picking another 'bound' cpu by random might create the same sort of
> problems in more complicated scenarios.
> 
> TJ, ISTR there used to be actually unbound kworkers, what happened to
> those? or am I misremembering things.
> 
> > > Lastly,
> > > one other thing to try is -next. Vincent reworked the load-balancer
> > > quite a bit.
> > > 
> > 
> > I've tried it with the lb patch series. I get basically the same results.
> > With the high granularity settings I get 3700 migrations for the 30 
> > second run at 4k. Of those about 3200 are active balance on stock 5.4-rc7.
> > With the lb patches it's 3500 and 3000, a slight drop. 
> 
> Thanks for testing that. I didn't expect miracles, but it is good to
> verify.
> 
> > Using the default granularity settings 50 and 22 for stock and 250 and 25.
> > So a few more total migrations with the lb patches but about the same active.
> 
> Right, so the granularity thing interacts with the load-balance period.
> By pushing it up, as some people appear to do, makes it so that what
> might be a temporal imablance is perceived as a persitent imbalance.
> 
> Tying the load-balance period to the gramularity is something we could
> consider, but then I'm sure, we'll get other people complaining the
> doesn't balance quick enough anymore.
> 

Thanks. These are old tuned settings that have been carried along. They may
not be right for newer kernels anyway. 


-- 





[Index of Archives]     [XFS Filesystem Development (older mail)]     [Linux Filesystem Development]     [Linux Audio Users]     [Yosemite Trails]     [Linux Kernel]     [Linux RAID]     [Linux SCSI]


  Powered by Linux