Re: [PATCH] blk-mq: fix corruption with direct issue

Jens Axboe <axboe@xxxxxxxxx> · Thu, 6 Dec 2018 20:04:38 -0700

On 12/6/18 7:46 PM, Theodore Y. Ts'o wrote:
> On Wed, Dec 05, 2018 at 11:03:01AM +0800, Ming Lei wrote:
>>
>> But at that time, there wasn't an io scheduler for MQ, so in theory the
>> issue should be there since v4.11, especially 945ffb60c11d ("mq-deadline:
>> add blk-mq adaptation of the deadline IO scheduler").
> 
> Hi Ming,
> 
> How serious were you about this being (theoretically) an issue
> since 4.11?  Can you talk about how it might get triggered, and
> how we can test for it?  The reason why I ask is because we're trying
> to track down a mysterious file system corruption problem on a 4.14.x
> stable kernel.  The symptoms are *very* eerily similar to kernel
> bugzilla #201685.
> 
> The problem is super-rare --- roughly once a week
> out of a population of about 2500 systems.  The workload is NFS
> serving.  Unfortunately, since 4.14.63 we can no
> longer disable blk-mq for the virtio-scsi driver, thanks to the commit
> b5b6e8c8d3b4 ("scsi: virtio_scsi: fix IO hang caused by automatic irq
> vector affinity") getting backported into 4.14.63 as commit
> 70b522f163bbb32.
> 
> We're considering reverting this patch in our 4.14 LTS kernel, and
> seeing whether it makes the problem go away.  Is there anything else
> you might suggest?

We should just make SCSI do the right thing, which is to unprep if
it sees BUSY and prep next time again. Otherwise I fear the direct
dispatch isn't going to be super useful, if a failed direct dispatch
prevents future merging.

This would be a lot less error prone as well for other cases.
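
To illustrate the idea, here is a rough sketch as a standalone toy
model. It is not kernel code: every name in it (toy_request, toy_prep,
toy_unprep, toy_queue_rq, toy_try_merge) is hypothetical, and the real
SCSI and blk-mq paths are considerably more involved. The only point is
that a request which hits BUSY goes back unprepared, so it can still be
merged and gets prepared again on the next dispatch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these names are hypothetical, not the kernel's. */
enum toy_status { TOY_OK, TOY_BUSY };

struct toy_request {
	unsigned int sectors;	/* current size of the request */
	bool prepared;		/* driver-private command state is set up */
};

struct toy_device {
	int free_slots;		/* hardware slots currently available */
};

/* Build driver-private state. A prepared request must not be merged. */
static void toy_prep(struct toy_request *rq)
{
	assert(!rq->prepared);
	rq->prepared = true;
}

/* Tear the driver-private state down again. */
static void toy_unprep(struct toy_request *rq)
{
	rq->prepared = false;
}

/*
 * Direct issue: prep, then try to hand the request to the device.
 * On BUSY, unprep before returning so the caller gets back a request
 * that is indistinguishable from one that was never issued.
 */
static enum toy_status toy_queue_rq(struct toy_device *dev,
				    struct toy_request *rq)
{
	toy_prep(rq);
	if (dev->free_slots == 0) {
		toy_unprep(rq);
		return TOY_BUSY;
	}
	dev->free_slots--;
	return TOY_OK;
}

/* Merging is only safe while no prepared state depends on the size. */
static bool toy_try_merge(struct toy_request *rq, unsigned int extra_sectors)
{
	if (rq->prepared)
		return false;
	rq->sectors += extra_sectors;
	return true;
}
```

In this model, a request that got TOY_BUSY can still grow via
toy_try_merge(), and the next toy_queue_rq() call preps it from scratch
with the merged size.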

-- 
Jens Axboe