On Sun 13-03-16 23:22:23, Tetsuo Handa wrote:
[...]

I am not familiar with the writeback code, so I might be missing
something essential here, but why are we even queueing more and more
work without checking whether enough has already been scheduled or is
in progress? Something as simple as:

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 6915c950e6e8..aa52e23ac280 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -887,7 +887,7 @@ void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
 {
 	struct wb_writeback_work *work;
 
-	if (!wb_has_dirty_io(wb))
+	if (!wb_has_dirty_io(wb) || writeback_in_progress(wb))
 		return;
 
 	/*

> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 5c46ed9..21450c7 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -929,7 +929,8 @@ void wb_start_writeback(struct bdi_writeback *wb, long nr_pages,
>  	 * This is WB_SYNC_NONE writeback, so if allocation fails just
>  	 * wakeup the thread for old dirty data writeback
>  	 */
> -	work = kzalloc(sizeof(*work), GFP_ATOMIC);
> +	work = kzalloc(sizeof(*work),
> +		       GFP_NOWAIT | __GFP_NOMEMALLOC | __GFP_NOWARN);

Well, I guess you are right that this doesn't sound like a context
which really needs access to memory reserves, and GFP_ATOMIC is used
here mostly for what can be achieved by GFP_NOWAIT now.
__GFP_NOMEMALLOC would be needed regardless, as you already pointed
out, because this might be called from the page reclaim context. So if
the above simple hack or another explicit limit cannot be done, then
__GFP_NOMEMALLOC is the absolute minimum.

>  	if (!work) {
>  		trace_writeback_nowork(wb);
>  		wb_wakeup(wb);
> -- 
> 1.8.3.1

-- 
Michal Hocko
SUSE Labs