On Wed, Sep 22, 2021 at 04:04:47PM +1000, Dave Chinner wrote:
> On Tue, Sep 21, 2021 at 11:58:31AM +0100, Mel Gorman wrote:
> > On Tue, Sep 21, 2021 at 10:13:17AM +1000, NeilBrown wrote:
> > > On Mon, 20 Sep 2021, Mel Gorman wrote:
> > > > -long wait_iff_congested(int sync, long timeout)
> > > > -{
> > > > -	long ret;
> > > > -	unsigned long start = jiffies;
> > > > -	DEFINE_WAIT(wait);
> > > > -	wait_queue_head_t *wqh = &congestion_wqh[sync];
> > > > -
> > > > -	/*
> > > > -	 * If there is no congestion, yield if necessary instead
> > > > -	 * of sleeping on the congestion queue
> > > > -	 */
> > > > -	if (atomic_read(&nr_wb_congested[sync]) == 0) {
> > > > -		cond_resched();
> > > > -
> > > > -		/* In case we scheduled, work out time remaining */
> > > > -		ret = timeout - (jiffies - start);
> > > > -		if (ret < 0)
> > > > -			ret = 0;
> > > > -
> > > > -		goto out;
> > > > -	}
> > > > -
> > > > -	/* Sleep until uncongested or a write happens */
> > > > -	prepare_to_wait(wqh, &wait, TASK_UNINTERRUPTIBLE);
> > >
> > > Uninterruptible wait.
> > >
> > > ....
> > > > +static void
> > > > +reclaim_throttle(pg_data_t *pgdat, enum vmscan_throttle_state reason,
> > > > +							long timeout)
> > > > +{
> > > > +	wait_queue_head_t *wqh = &pgdat->reclaim_wait;
> > > > +	unsigned long start = jiffies;
> > > > +	long ret;
> > > > +	DEFINE_WAIT(wait);
> > > > +
> > > > +	atomic_inc(&pgdat->nr_reclaim_throttled);
> > > > +	WRITE_ONCE(pgdat->nr_reclaim_start,
> > > > +		 node_page_state(pgdat, NR_THROTTLED_WRITTEN));
> > > > +
> > > > +	prepare_to_wait(wqh, &wait, TASK_INTERRUPTIBLE);
> > >
> > > Interruptible wait.
> > >
> > > Why the change? I think these waits really need to be TASK_UNINTERRUPTIBLE.
> > >
> >
> > Because from mm/ context, I saw no reason why the task *should* be
> > uninterruptible. It's waiting on other tasks to complete IO and it is not
> > protecting device state, filesystem state or anything else. If it gets
> > a signal, it's safe to wake up, particularly if that signal is KILL and
> > the context is a direct reclaimer.
>
> I disagree. Whether the sleep should be interruptible or
> not is entirely dependent on whether the caller can handle failure
> or not. If this is GFP_NOFAIL, allocation must not fail no matter
> what the context is, so signals and the like are irrelevant.
>
> For a context that can handle allocation failure, then it makes
> sense to wake on events that will result in the allocation failing
> immediately. But if all this does is make the allocation code go
> around another retry loop sooner, then an interruptible sleep still
> doesn't make any sense at all here...
>

Ok, between this and Neil's mail on the same topic, I'm convinced.

-- 
Mel Gorman
SUSE Labs