On Tue, Dec 04, 2018 at 03:47:46PM -0700, Jens Axboe wrote: > If we attempt a direct issue to a SCSI device, and it returns BUSY, then > we queue the request up normally. However, the SCSI layer may have > already setup SG tables etc for this particular command. If we later > merge with this request, then the old tables are no longer valid. Once > we issue the IO, we only read/write the original part of the request, > not the new state of it. > > This causes data corruption, and is most often noticed with the file > system complaining about the just read data being invalid: > > [ 235.934465] EXT4-fs error (device sda1): ext4_iget:4831: inode #7142: comm dpkg-query: bad extra_isize 24937 (inode size 256) > > because most of it is garbage... > > This doesn't happen from the normal issue path, as we will simply defer > the request to the hardware queue dispatch list if we fail. Once it's on > the dispatch list, we never merge with it. > > Fix this from the direct issue path by flagging the request as > REQ_NOMERGE so we don't change the size of it before issue. > > See also: > https://bugzilla.kernel.org/show_bug.cgi?id=201685 > > Fixes: 6ce3dd6eec1 ("blk-mq: issue directly if hw queue isn't busy in case of 'none'") > Signed-off-by: Jens Axboe <axboe@xxxxxxxxx> Tested-by: Guenter Roeck <linux@xxxxxxxxxxxx> ... on two systems affected by the problem. > > diff --git a/block/blk-mq.c b/block/blk-mq.c > index 3f91c6e5b17a..d8f518c6ea38 100644 > --- a/block/blk-mq.c > +++ b/block/blk-mq.c > @@ -1715,6 +1715,15 @@ static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx, > break; > case BLK_STS_RESOURCE: > case BLK_STS_DEV_RESOURCE: > + /* > + * If direct dispatch fails, we cannot allow any merging on > + * this IO. Drivers (like SCSI) may have set up permanent state > + * for this request, like SG tables and mappings, and if we > + * merge to it later on then we'll still only do IO to the > + * original part. > + */ > + rq->cmd_flags |= REQ_NOMERGE; > + > blk_mq_update_dispatch_busy(hctx, true); > __blk_mq_requeue_request(rq); > break;