On Fri, Nov 19, 2021 at 04:49:34PM +0000, Tim Walker wrote:
>
>
> On 19 Nov 2021 Tim Walker wrote:
> >
> >
> >On Thu, 18 Nov 2021 23:30:41 +0800, Ming Lei wrote:
> >> We never insert flush request into scheduler queue before.
> >>
> >> Recently commit d92ca9d8348f ("blk-mq: don't handle non-flush requests in
> >> blk_insert_flush") tries to handle FUA data request as normal request.
> >> This way has caused warning[1] in mq-deadline dd_exit_sched() or io hang in
> >> case of kyber since RQF_ELVPRIV isn't set for flush request, then
> >> ->finish_request won't be called.
> >>
> >> [...]
> >
> >Applied, thanks!
> >
> >[1/1] blk-mq: don't insert FUA request with data into scheduler queue
> >      commit: 2b504bd4841bccbf3eb83c1fec229b65956ad8ad
> >
> >Best regards,
> >--
> >Jens Axboe
> >
> >
> >
>
> I know the discussion is over,

This thread is only about fixing one recent regression caused by queuing
FUA data requests into the scheduler queue. Direct dispatch has actually
been done for a very long time, but I don't mean that it is reasonable.

> but I can't figure out why we treat FUA as a flush. A FUA write only
> applies to the command at hand, and is not required to flush any previous
> command's data from the device's volatile write cache. Similarly for a
> read request - servicing a read from media is really more the rule than
> the exception (lots of workload dependencies here...), so why would a
> FUA read bypass the scheduler?

Are there any FUA read users in the Linux kernel? I just ran a quick grep,
and it seems FUA is only used for sync writes.

> The device is always free to service any request from media or cache (
> as long as it follows the applicable volatile write and read cache settings),
> so normally we don't know how it is treating the request, so it doesn't seem
> to matter.
>
> Consider a FUA write: Why does the fact that we intend that write to bypass
> the device volatile write cache mean it should bypass the scheduler? All the
> other traffic-shaping algorithms that help effectively schedule writes are
> still applicable. E.g. we know we can delay/coalesce them a bit to allow
> reads to be prioritized, but I can't figure out why we would fast-track a
> FUA write. Isn't the value to the system for scheduling still valid, even
> though we are forcing the data to go to media?

It shouldn't be hard to queue FUA requests into the scheduler, but some
details need to be considered, such as whether FUA can be merged with
normal IO, and maybe others.

Also, do you have any test or benchmark results to support the change of
queuing FUA to the scheduler?

Thanks,
Ming