Re: [PATCH 4/5] vmscan: Have kswapd sleep for a short interval and double check it should be asleep

> On Tue, Nov 17, 2009 at 09:18:11PM +0900, KOSAKI Motohiro wrote:
> > > On Tue, Nov 17, 2009 at 08:03:21PM +0900, KOSAKI Motohiro wrote:
> > > > I'm sorry for the long delay.
> > > > 
> > > > > On Sat, Nov 14, 2009 at 06:34:23PM +0900, KOSAKI Motohiro wrote:
> > > > > > 2009/11/14 Mel Gorman <mel@xxxxxxxxx>:
> > > > > > > On Sat, Nov 14, 2009 at 03:00:57AM +0900, KOSAKI Motohiro wrote:
> > > > > > >> > On Fri, Nov 13, 2009 at 07:43:09PM +0900, KOSAKI Motohiro wrote:
> > > > > > >> > > > After kswapd balances all zones in a pgdat, it goes to sleep. In the event
> > > > > > >> > > > of no IO congestion, kswapd can go to sleep very shortly after the high
> > > > > > >> > > > watermark was reached. If there is a constant stream of allocations from
> > > > > > >> > > > parallel processes, it can mean that kswapd went to sleep too quickly and
> > > > > > >> > > > the high watermark is not being maintained for a sufficient length of time.
> > > > > > >> > > >
> > > > > > >> > > > This patch makes kswapd go to sleep as a two-stage process. It first
> > > > > > >> > > > tries to sleep for HZ/10. If it is woken up by another process or the
> > > > > > >> > > > high watermark is no longer met, it's considered a premature sleep and
> > > > > > >> > > > kswapd continues work. Otherwise it goes fully to sleep.
> > > > > > >> > > >
> > > > > > >> > > > This adds more counters to distinguish between fast and slow breaches of
> > > > > > >> > > > watermarks. A "fast" premature sleep is one where the low watermark was
> > > > > > >> > > > hit in a very short time after kswapd going to sleep. A "slow" premature
> > > > > > >> > > > sleep indicates that the high watermark was breached after a very short
> > > > > > >> > > > interval.
> > > > > > >> > > >
> > > > > > >> > > > Signed-off-by: Mel Gorman <mel@xxxxxxxxx>
> > > > > > >> > >
> > > > > > >> > > Why are you submitting this patch to mainline? This is a debugging patch,
> > > > > > >> > > no more and no less.
> > > > > > >> > >
> > > > > > >> >
> > > > > > >> > Do you mean the stats part? The stats are included until such time as the page
> > > > > > >> > allocator failure reports stop or are significantly reduced. In the event a
> > > > > > >> > report is received, the value of the counters help determine if kswapd was
> > > > > > >> > struggling or not. They should be removed once this mess is ironed out.
> > > > > > >> >
> > > > > > >> > If there is a preference, I can split out the stats part and send it to
> > > > > > >> > people with page allocator failure reports for retesting.
> > > > > > >>
> > > > > > >> I'm sorry my last mail didn't have enough explanation.
> > > > > > >> These stats help to solve this issue, I agree. But after this issue is solved,
> > > > > > >> I can't imagine how an administrator would use these stats. If KSWAPD_PREMATURE_FAST or
> > > > > > >> KSWAPD_PREMATURE_SLOW increases significantly, what should the admin do?
> > > > > > >
> > > > > > > One possible workaround would be to raise min_free_kbytes while a fix is
> > > > > > > being worked on.
> > > > > > 
> > > > > > Please correct me if I said something wrong.
> > > > > 
> > > > > You didn't.
> > > > > 
> > > > > > If I were the admin, I wouldn't watch these stats, because kswapd waking up
> > > > > > frequently doesn't mean any trouble. Instead I would watch the number of
> > > > > > allocation failures.
> > > > > 
> > > > > The stats are not tracking when kswapd wakes up. It helps track how
> > > > > quickly the high or low watermarks are going under once kswapd tries to
> > > > > go back to sleep.
> > > > 
> > > > Umm, honestly I'm still puzzled. Probably we need to go back one step.
> > > > kswapd wakes up when the amount of memory is less than the low watermark and sleeps
> > > > when the amount of memory is more than the high watermark. We need a
> > > > sign of GFP_ATOMIC failure.
> > > > 
> > > > My point is, a kswapd wakeup only happens after kswapd has slept. But if the system is
> > > > under heavy pressure and the amount of memory goes up and down between the low watermark
> > > > and the high watermark, these stats don't increase at all. IOW, these stats are strongly related to
> > > > the high watermark.
> > > > 
> > > 
> > > Yes, this is true but as long as kswapd is awake and doing its job, it
> > > will continue taking direction on what order it should be reclaiming from
> > > processes that failed the low_watermark test.  The GFP_ATOMIC allocations
> > > will be allowed to go under this low watermark but will have informed kswapd
> > > what order it should be reclaiming at so it stays working.
> > > 
> > > A stat that increases between the low and high watermark would indicate
> > > that memory pressure is there or that the reclaim algorithm is not
> > > working as expected but that's checking for a different problem.
> > > 
> > > What I was looking at was  kswapd going to sleep and the low or min watermarks
> > > being hit very quickly after that so that kswapd pre-emptively kicks in
> > > before allocations start failing again.
> > > 
> > > > Probably, the min watermark or low watermark are more useful for us.
> > > 
> > > Why? kswapd is awake between those points.
> > 
> > What's the difference between (1) and (2) below?
> > 
> > 1. kswapd() runs 100ms, sleeps 10ms, and runs 100ms.
> > 2. kswapd() runs 200ms.
> 
> Because prior to the patch, once kswapd went to sleep, it would
> not wake again until the low watermark was reached. There appeared
> to be a timing issue where congestion_wait() would block kswapd
> just long enough before checking the high watermark to mean that
> it stayed awake. This wasn't happening hence the approach of
> briefly-sleep-after-high-watermark-is-reached-and-double-check-watermarks-are-ok.

You are right.
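
For reference, the briefly-sleep-and-double-check logic described above can be
sketched roughly as below. This is a plain userspace model, not the patch code:
the enum, the function classify_short_sleep() and its parameters are
hypothetical names, and it assumes that being woken during the short sleep
counts as the "fast" case while finding the high watermark no longer met at the
end of the short sleep counts as the "slow" case.

```c
#include <stdbool.h>

/* Hypothetical outcome labels, loosely mirroring the counters in this thread. */
enum sleep_outcome {
    SLEEP_FULL,      /* short sleep completed, watermarks still ok */
    PREMATURE_FAST,  /* woken by an allocator during the short sleep */
    PREMATURE_SLOW   /* short sleep completed, but high watermark breached */
};

/*
 * Decide what to do after the short (HZ/10 in the patch description) sleep.
 * woken_early: another process woke kswapd before the timeout expired.
 * free_pages / high_wmark: simplified single-zone view of the watermark check.
 */
enum sleep_outcome classify_short_sleep(bool woken_early,
                                        unsigned long free_pages,
                                        unsigned long high_wmark)
{
    if (woken_early)
        return PREMATURE_FAST;   /* keep working, count a "fast" breach */
    if (free_pages < high_wmark)
        return PREMATURE_SLOW;   /* keep working, count a "slow" breach */
    return SLEEP_FULL;           /* safe to go fully to sleep */
}
```

In this model, only the SLEEP_FULL outcome lets kswapd sleep until the next
wakeup; both premature outcomes send it back to reclaim.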

> > (1) represents the amount of memory going beyond the high watermark very briefly and then
> > going below the low watermark right after. (2) represents the amount of memory coming close
> > to the high watermark without reaching it, and probably going below the low watermark right after.
> > It's almost the same memory pressure, but (1) increases KSWAPD_HIGH_WMARK_HIT_QUICKLY and
> > (2) doesn't increase any stat.
> > 
> > Thus, we can't think that KSWAPD_HIGH_WMARK_HIT_QUICKLY indicates memory pressure.
> > 
> 
> It indicates mild memory pressure because the high watermark was only
> reached for a short period of time.
> 
> > > > The number of wake_all_kswapd() calls is related to the low watermark, and it continuously
> > > > increases even when the system is under strong memory pressure. I'm ok with that.
> > > > KSWAPD_NO_CONGESTION_WAIT is related to the min watermark. I'm ok with that too.
> > > > The number of page allocation failures is related to the min watermark too. I'm ok with that too.
> > > > 
> > > > IOW, I only dislike that this stat stops increasing under strong memory pressure (see the explanation above).
> > > > Can you please tell me why you think kswapd sleep time is so important?
> > > 
> > > I don't think the amount of time it has slept is important. I think it's
> > > important to know if the system is getting back into watermark trouble very
> > > shortly after kswapd reached the high watermark.
> > 
> > Probably my last mail didn't explain this well.
> > My point is that whether memory is beyond the high watermark or not doesn't
> > indicate any meaningful phenomenon.
> > 
> > So I'd prefer low or min watermark related stats.
> 
> What would that have to do with the kswapd-briefly-sleep logic? i.e. the
> stats you are suggesting, while meaningful, are for looking at a
> different type of problem. If you want those stats, then I should revert
> the stats part of this patch altogether. Does that then mean you also
> want the patch that makes kswapd double check watermarks to be dropped
> or just have no associated stats to see what it's doing in the event of
> allocation failure?

Yes, you are right.

After thinking for a while (and a good night's sleep), I conclude that I should not
push my personal preference any further. I'm sorry.

Honestly, I still don't understand how these stats would be used, but the stats are
not a core piece of this patch's concept. We shouldn't get stuck on a small issue.

So I'll review your latest patch soon.

Thanks.



