On 31 Jul 2019, at 22:17, Dave Chinner wrote:

> From: Dave Chinner <dchinner@xxxxxxxxxx>
>
> Running metadata intensive workloads, I've been seeing the AIL
> pushing getting stuck on pinned buffers and triggering log forces.
> The log force is taking a long time to run because the log IO is
> getting throttled by wbt_wait() - the block layer writeback
> throttle. It's being throttled because there is a huge amount of
> metadata writeback going on which is filling the request queue.
>
> IOWs, we have a priority inversion problem here.
>
> Mark the log IO bios with REQ_IDLE so they don't get throttled
> by the block layer writeback throttle. When we are forcing the CIL,
> we are likely to need to do tens of log IOs, and they are issued as
> fast as they can be built and IO completed. Hence REQ_IDLE is
> appropriate - it's an indication that more IO will follow shortly.
>
> And because we also set REQ_SYNC, the writeback throttle will now
> treat log IO the same way it treats direct IO writes - it will not
> throttle them at all. Hence we solve the priority inversion problem
> caused by the writeback throttle being unable to distinguish between
> high priority log IO and background metadata writeback.
>

[ cc Jens ]

We spent a lot of time getting rid of these inversions in io.latency
(and the new io.cost), where REQ_META just blows through the throttling
and goes into back charging instead.

It feels awkward to have one set of prio inversion workarounds for io.*
and another for wbt. Jens, should we make an explicit one that doesn't
rely on magic side effects, or just decide that metadata is meta enough
to break all the rules?

-chris
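
For anyone following along, here is a rough userspace sketch of the
"magic side effect" being discussed: a write that carries both the
sync and idle hints is treated like a direct IO write and skips the
writeback throttle. This is not the kernel code. The model_* names,
the flag values, and the struct are all hypothetical stand-ins, and
the handling of reads and discards is an assumption about how wbt
behaves rather than something stated in the patch description.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the block layer op types. */
enum model_op { MODEL_OP_READ, MODEL_OP_WRITE, MODEL_OP_DISCARD };

/* Illustrative flag bits only; these are NOT the kernel's values. */
#define MODEL_REQ_SYNC (1u << 0)
#define MODEL_REQ_IDLE (1u << 1)

/* Minimal model of a bio: an operation plus hint flags. */
struct model_bio {
	enum model_op op;
	unsigned int flags;
};

/*
 * Decide whether the modelled writeback throttle would throttle this
 * bio. A write with BOTH sync and idle set looks like a direct IO
 * write and is exempt; any other write is throttled.
 */
static bool model_wbt_should_throttle(const struct model_bio *bio)
{
	const unsigned int odirect = MODEL_REQ_SYNC | MODEL_REQ_IDLE;

	switch (bio->op) {
	case MODEL_OP_WRITE:
		if ((bio->flags & odirect) == odirect)
			return false;	/* looks like direct IO: exempt */
		return true;		/* ordinary writeback */
	case MODEL_OP_DISCARD:
		return true;		/* assumption: discards are throttled */
	default:
		return false;		/* assumption: reads are not throttled */
	}
}
```

In this model a log write that sets only the sync hint is still
throttled, which is why the patch adds REQ_IDLE on top of REQ_SYNC.
It also shows the awkward part: the exemption is keyed on a flag
combination meant for direct IO, not on anything that says "this is
high priority metadata".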