On Thu, Aug 4, 2016 at 10:21 PM, Jens Axboe <axboe@xxxxxxxxx> wrote:
> On 08/04/2016 12:36 PM, Konstantin Khlebnikov wrote:
>>
>> I've found a funny live-lock between raid10 barriers during resync and
>> memory-controller hard limits. Inside mpage_readpages() the task holds
>> a bio in its plug, which blocks the barrier in raid10. Its memory
>> cgroup has no free memory, so the task goes into the reclaimer, but
>> all reclaimable pages are dirty and cannot be written back because
>> raid10 is rebuilding and stuck on the barrier.
>>
>> The common flush of such IO in schedule() never happens because the
>> machine where this occurred has a lot of free CPUs, so the task never
>> goes to sleep.
>>
>> The lock is 'live' because changing the memory limit, or killing the
>> tasks which hold the stuck bio, unblocks all progress.
>>
>> This happened on 3.18.x, but I see no difference in the upstream
>> logic. Theoretically it might happen even without a memory cgroup.
>
> So the issue is that the caller of wakeup_flusher_threads() ends up
> never going to sleep, hence the plug is never auto-flushed. I didn't
> quite understand your reasoning for why it never sleeps above, but
> that must be the gist of it.

Ah, right: a simple context switch doesn't flush the plug, so the number
of CPUs is irrelevant.

> I don't see anything inherently wrong with the fix.
>
> --
> Jens Axboe
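
For anyone following along, the flush in question lives on the
scheduler's sleep path. Below is a simplified sketch of
sched_submit_work() from kernel/sched/core.c (circa v3.18; paraphrased
rather than quoted verbatim, comments mine) showing why only a task
that actually blocks submits its plugged IO:

    #include <linux/blkdev.h>   /* blk_needs_flush_plug(), blk_schedule_flush_plug() */
    #include <linux/sched.h>
    #include <linux/sched/rt.h> /* tsk_is_pi_blocked() */

    static inline void sched_submit_work(struct task_struct *tsk)
    {
            /*
             * A task whose state is still TASK_RUNNING (state == 0) is
             * merely being preempted, not blocking, so we return without
             * touching its plug. This is why a task that keeps running
             * on an idle-rich machine never gets auto-flushed.
             */
            if (!tsk->state || tsk_is_pi_blocked(tsk))
                    return;

            /*
             * Only a task that is about to sleep submits its plugged IO
             * here, to avoid deadlocks on the bios it is holding.
             */
            if (blk_needs_flush_plug(tsk))
                    blk_schedule_flush_plug(tsk);
    }

schedule() calls this right before __schedule(), so the plug flush runs
only on the block-and-sleep path; a preempted-but-still-runnable task
keeps its bio plugged indefinitely, which matches the live-lock
described above.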