On Thu, Aug 4, 2016 at 10:21 PM, Jens Axboe <axboe@xxxxxxxxx> wrote:
> On 08/04/2016 12:36 PM, Konstantin Khlebnikov wrote:
>>
>> I've found a funny live-lock between raid10 barriers during resync and
>> memory-controller hard limits. Inside mpage_readpages() the task holds
>> a bio in its plug, which blocks the barrier in raid10. Its memory
>> cgroup has no free memory, so the task goes into the reclaimer, but
>> all reclaimable pages are dirty and cannot be written back because
>> raid10 is rebuilding and stuck on the barrier.
>>
>> The common flush of such IO in schedule() never happens because the
>> machine where this occurred has a lot of free CPUs, so the task never
>> goes to sleep.
>>
>> The lock is 'live' because changing the memory limit, or killing the
>> tasks which hold the stuck bio, unblocks all progress.
>>
>> This happened on 3.18.x, but I see no difference in the upstream
>> logic. Theoretically it might happen even without a memory cgroup.
>
> So the issue is that the caller of wakeup_flusher_threads() ends up
> never going to sleep, hence the plug is never auto-flushed. I didn't
> quite understand your reasoning for why it never sleeps above, but
> that must be the gist of it.

Ah, right: a simple context switch doesn't flush the plug, so the number
of CPUs is irrelevant.

> I don't see anything inherently wrong with the fix.
>
> --
> Jens Axboe
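
For anyone following along, the flush in question lives on the
scheduler's sleep path. Below is a simplified sketch of
sched_submit_work() from kernel/sched/core.c (circa v3.18; paraphrased
rather than quoted verbatim, comments mine) showing why only a task
that actually blocks submits its plugged IO:

    #include <linux/blkdev.h>   /* blk_needs_flush_plug(), blk_schedule_flush_plug() */
    #include <linux/sched.h>
    #include <linux/sched/rt.h> /* tsk_is_pi_blocked() */

    static inline void sched_submit_work(struct task_struct *tsk)
    {
            /*
             * A task whose state is still TASK_RUNNING (state == 0) is
             * merely being preempted, not blocking, so we return without
             * touching its plug. This is why a task that keeps running
             * on an idle-rich machine never gets auto-flushed.
             */
            if (!tsk->state || tsk_is_pi_blocked(tsk))
                    return;

            /*
             * Only a task that is about to sleep submits its plugged IO
             * here, to avoid deadlocks on the bios it is holding.
             */
            if (blk_needs_flush_plug(tsk))
                    blk_schedule_flush_plug(tsk);
    }

schedule() calls this right before __schedule(), so the plug flush runs
only on the block-and-sleep path; a preempted-but-still-runnable task
keeps its bio plugged indefinitely, which matches the live-lock
described above.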