On Wed, Dec 05, 2018 at 11:03:01AM +0800, Ming Lei wrote:
>
> But at that time, there isn't io scheduler for MQ, so in theory the
> issue should be there since v4.11, especially 945ffb60c11d ("mq-deadline:
> add blk-mq adaptation of the deadline IO scheduler").

Hi Ming,

How serious were you about this issue (theoretically) being there
since 4.11?  Can you talk about how it might get triggered, and how we
can test for it?  The reason I ask is that we're trying to track down
a mysterious file system corruption problem on a 4.14.x stable kernel.
The symptoms are *very* eerily similar to kernel bugzilla #201685.

The difficulty is that the problem is super-rare --- roughly once a
week out of a population of about 2500 systems.  The workload is NFS
serving.  Unfortunately, since 4.14.63 we can no longer disable blk-mq
for the virtio-scsi driver, thanks to the commit b5b6e8c8d3b4 ("scsi:
virtio_scsi: fix IO hang caused by automatic irq vector affinity")
getting backported into 4.14.63 as commit 70b522f163bbb32.

We're considering reverting this patch in our 4.14 LTS kernel, and
seeing whether it makes the problem go away.  Is there anything else
you might suggest?

Thanks,

- Ted

P.S.  Unlike the repros that users were seeing in #201685, we *did*
have an I/O scheduler enabled --- it was mq-deadline.  But right now,
given your comments, and the corruptions that we're seeing, I'm not
feeling very warm and fuzzy about block-mq.  :-( :-( :-(