I've found a funny live-lock between raid10 barriers during resync and
memory controller hard limits. Inside mpage_readpages() the task holds a
bio on its plug, and that bio blocks the barrier in raid10. The task's
memory cgroup has no free memory, so the task goes into reclaim, but all
reclaimable pages are dirty and cannot be written because raid10 is
rebuilding and is stuck on the barrier.

The usual flush of such plugged IO in schedule() never happens because
the machine where this happened has a lot of free CPUs and the task never
goes to sleep.

The lock is 'live' because changing the memory limit or killing the tasks
that hold the stuck bio unblocks everything.

This happened on 3.18.x, but I see no difference in the upstream logic.
Theoretically this might happen even without memory cgroups.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@xxxxxxxxxxxxxx>
---
 fs/fs-writeback.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 56c8fda436c0..ed58863cdb5d 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1948,6 +1948,12 @@ void wakeup_flusher_threads(long nr_pages, enum wb_reason reason)
 {
 	struct backing_dev_info *bdi;
 
+	/*
+	 * If we are expecting writeback progress we must submit plugged IO.
+	 */
+	if (blk_needs_flush_plug(current))
+		blk_schedule_flush_plug(current);
+
 	if (!nr_pages)
 		nr_pages = get_nr_dirty_pages();
 
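
For illustration only, the dependency chain described in the changelog can
be modelled in user space. This is a toy sketch, not kernel code: every
toy_* name is hypothetical, and the model only assumes what the changelog
states, namely that a bio held on the plug keeps the barrier stuck, that
writeback cannot proceed while the barrier is stuck, and that the
reclaiming task never sleeps.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space model of the dependency chain; none of this is kernel code. */
struct toy_state {
    int plugged_bios;  /* reads queued on the task's plug, not yet submitted */
    int dirty_pages;   /* reclaimable pages that first need writeback */
};

/* Hypothetical stand-in for flushing the plug: the held reads get submitted. */
static void toy_flush_plug(struct toy_state *s)
{
    s->plugged_bios = 0;
}

/* In this model the barrier can make progress only once no bio is held back. */
static bool toy_barrier_can_progress(const struct toy_state *s)
{
    return s->plugged_bios == 0;
}

/* Writeback cleans one page per call, but only when the barrier is not stuck. */
static bool toy_writeback_one(struct toy_state *s)
{
    if (!toy_barrier_can_progress(s) || s->dirty_pages == 0)
        return false;
    s->dirty_pages--;
    return true;
}

/*
 * Busy reclaim loop that never sleeps, so nothing flushes the plug implicitly.
 * Returns the number of passes needed to clean every page, or -1 if the pass
 * budget runs out first (the live-lock case).
 */
static int toy_reclaim(struct toy_state *s, bool flush_plug_first, int max_passes)
{
    assert(max_passes > 0);
    for (int pass = 1; pass <= max_passes; pass++) {
        /* This branch plays the role of the explicit flush added by the patch. */
        if (flush_plug_first && s->plugged_bios > 0)
            toy_flush_plug(s);
        toy_writeback_one(s);
        if (s->dirty_pages == 0)
            return pass;
    }
    return -1;
}
```

Starting from one plugged bio and three dirty pages, toy_reclaim() with
flush_plug_first set to false exhausts any pass budget and returns -1,
while with flush_plug_first set to true it returns 3, one pass per dirty
page.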