On 1/30/19 2:01 AM, Jianchao Wang wrote:
> Florian reported a io hung issue when fsync(). It should be
> triggered by following race condition.
>
> data + post flush         a flush
>
> blk_flush_complete_seq
>   case REQ_FSEQ_DATA
>     blk_flush_queue_rq
>     issued to driver      blk_mq_dispatch_rq_list
>                             try to issue a flush req
>                             failed due to NON-NCQ command
>                             .queue_rq return BLK_STS_DEV_RESOURCE
>
> request completion
>   req->end_io // doesn't check RESTART
>     mq_flush_data_end_io
>       case REQ_FSEQ_POSTFLUSH
>         blk_kick_flush
>           do nothing because previous flush
>           has not been completed
>       blk_mq_run_hw_queue
>                             insert rq to hctx->dispatch
>                             due to RESTART is still set, do nothing
>
> To fix this, replace the blk_mq_run_hw_queue in mq_flush_data_end_io
> with blk_mq_sched_restart to check and clear the RESTART flag.

Applied, thanks.

-- 
Jens Axboe