I've found a funny livelock between raid10 barriers during resync and
memory cgroup hard limits. Inside mpage_readpages() the task holds a
plugged bio which blocks the barrier in raid10. Its memory cgroup has no
free memory, so the task goes into reclaim, but all reclaimable pages are
dirty and cannot be written back because raid10 is rebuilding and is stuck
on the barrier. The usual flush of such plugged IO in schedule() never
happens because the machine has plenty of free CPUs and the task never
goes to sleep. The lock is 'live' because raising the memory limit or
killing the task which holds the stuck bio unblocks the whole thing.

This happened on 3.18.x, but I see no difference in the upstream logic.
In theory this could happen even without a memory cgroup.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@xxxxxxxxxxxxxx>
---
 fs/fs-writeback.c |    6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 56c8fda436c0..ed58863cdb5d 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1948,6 +1948,12 @@ void wakeup_flusher_threads(long nr_pages, enum wb_reason reason)
 {
 	struct backing_dev_info *bdi;
 
+	/*
+	 * If we are expecting writeback progress we must submit plugged IO.
+	 */
+	if (blk_needs_flush_plug(current))
+		blk_schedule_flush_plug(current);
+
 	if (!nr_pages)
 		nr_pages = get_nr_dirty_pages();
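
For readers less familiar with block plugging, here is a minimal,
illustrative C sketch of the pattern involved (not part of the patch; the
demo_* function names are made up). Bios built under a plug sit on
current->plug and reach the device only when the plug is flushed
explicitly or when the task blocks in schedule(), which is exactly the
flush that never happens in the scenario described above.

#include <linux/blkdev.h>
#include <linux/sched.h>

/*
 * A reader path such as mpage_readpages() works roughly like this:
 * bios submitted between blk_start_plug() and blk_finish_plug() are
 * queued on current->plug instead of being sent to the device.
 */
static void demo_plugged_reader(void)
{
	struct blk_plug plug;

	blk_start_plug(&plug);
	/* ... build bios and submit them; they stay on current->plug ... */
	blk_finish_plug(&plug);	/* normally the point where the IO goes out */
}

/*
 * The hunk above applies the same rule that schedule() applies
 * implicitly: before waiting for writeback progress, submit our own
 * plugged IO so nobody ends up waiting on bios only we can push out.
 */
static void demo_wait_for_writeback_progress(void)
{
	if (blk_needs_flush_plug(current))
		blk_schedule_flush_plug(current);

	/* ... now it is safe to wait for IO / reclaim progress ... */
}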