On Mon, Sep 13, 2010 at 5:55 PM, Mel Gorman <mel@xxxxxxxxx> wrote:
> On Mon, Sep 13, 2010 at 12:37:44AM +0900, Minchan Kim wrote:
>> > > > > > <SNIP>
>> > > > > >
>> > > > > > + * in sleeping but cond_resched() is called in case the current process has
>> > > > > > + * consumed its CPU quota.
>> > > > > > + */
>> > > > > > +long wait_iff_congested(struct zone *zone, int sync, long timeout)
>> > > > > > +{
>> > > > > > +	long ret;
>> > > > > > +	unsigned long start = jiffies;
>> > > > > > +	DEFINE_WAIT(wait);
>> > > > > > +	wait_queue_head_t *wqh = &congestion_wqh[sync];
>> > > > > > +
>> > > > > > +	/*
>> > > > > > +	 * If there is no congestion, check the amount of writeback. If there
>> > > > > > +	 * is no significant writeback and no congestion, just cond_resched
>> > > > > > +	 */
>> > > > > > +	if (atomic_read(&nr_bdi_congested[sync]) == 0) {
>> > > > > > +		unsigned long inactive, writeback;
>> > > > > > +
>> > > > > > +		inactive = zone_page_state(zone, NR_INACTIVE_FILE) +
>> > > > > > +				zone_page_state(zone, NR_INACTIVE_ANON);
>> > > > > > +		writeback = zone_page_state(zone, NR_WRITEBACK);
>> > > > > > +
>> > > > > > +		/*
>> > > > > > +		 * If less than half the inactive list is being written back,
>> > > > > > +		 * reclaim might as well continue
>> > > > > > +		 */
>> > > > > > +		if (writeback < inactive / 2) {
>> > > > >
>> > > > > I am not sure this is best.
>> > > > >
>> > > >
>> > > > I'm not saying it is. The objective is to identify a situation where
>> > > > sleeping until the next write or congestion clears is pointless. We have
>> > > > already identified that we are not congested so the question is "are we
>> > > > writing a lot at the moment?". The assumption is that if there is a lot
>> > > > of writing going on, we might as well sleep until one completes rather
>> > > > than reclaiming more.
>> > > >
>> > > > This is the first effort at identifying pointless sleeps. Better ones
>> > > > might be identified in the future but that shouldn't stop us making a
>> > > > semi-sensible decision now.
>> > >
>> > > nr_bdi_congested is no problem since we have used it for a long time.
>> > > But you added a new rule about writeback.
>> > >
>> >
>> > Yes, I'm trying to add a new rule about throttling in the page allocator
>> > and from vmscan. As you can see from the results in the leader, we are
>> > currently sleeping more than we need to.
>>
>> I can see the results about avoiding congestion_wait but can't find the
>> result for the (writeback < inactive / 2) heuristic.
>>
>
> See the leader and each of the report sections entitled
> "FTrace Reclaim Statistics: congestion_wait". It provides a measure of
> how sleep times are affected.
>
> "congest waited" are waits due to calling congestion_wait. "conditional waited"
> are those related to wait_iff_congested(). As you will see from the reports,
> sleep times are reduced overall while callers of wait_iff_congested() still
> go to sleep. The reports entitled "FTrace Reclaim Statistics: vmscan" show
> how reclaim is behaving and indicators so far are that reclaim is not hurt
> by introducing wait_iff_congested().

I saw the results. They show the effectiveness of _both_ nr_bdi_congested
and (writeback < inactive / 2) together. What I am asking about is the
effectiveness of (writeback < inactive / 2) _alone_.
If we remove the (writeback < inactive / 2) check and unconditionally return,
how does the behavior change?

Am I misunderstanding your report in the leader?

-- 
Kind regards,
Minchan Kim
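
To make the question concrete, here is a rough user-space sketch of the two
variants being compared. It is not kernel code. The struct and function names
(zone_counts, skip_sleep_with_writeback_check, skip_sleep_congestion_only) are
hypothetical, and since the rest of wait_iff_congested() is snipped in the
quote, the sketch assumes the branch taken when (writeback < inactive / 2)
skips the sleep, as the patch comment "reclaim might as well continue"
suggests.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-zone counters the patch reads. */
struct zone_counts {
	unsigned long inactive_file;	/* models NR_INACTIVE_FILE */
	unsigned long inactive_anon;	/* models NR_INACTIVE_ANON */
	unsigned long writeback;	/* models NR_WRITEBACK */
};

/*
 * Variant A: the rule in the quoted patch. Skip the sleep only when
 * no BDI is congested AND less than half of the inactive pages are
 * under writeback.
 */
bool skip_sleep_with_writeback_check(int nr_congested,
				     const struct zone_counts *z)
{
	if (nr_congested == 0) {
		unsigned long inactive = z->inactive_file + z->inactive_anon;

		if (z->writeback < inactive / 2)
			return true;
	}
	return false;
}

/*
 * Variant B: the alternative asked about. Drop the writeback test and
 * skip the sleep whenever no BDI is congested.
 */
bool skip_sleep_congestion_only(int nr_congested)
{
	return nr_congested == 0;
}
```

The two variants only disagree when nothing is congested but at least half of
the inactive pages are under writeback. In that case variant A still sleeps and
variant B does not, so a measurement of that case alone would isolate the
effect of the writeback rule.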