On 2020/8/4 16:14, Ming Lei wrote:
> On Tue, Aug 04, 2020 at 03:50:32PM +0800, Coly Li wrote:
>> On 2020/8/4 14:01, Ming Lei wrote:
>>> On Tue, Aug 04, 2020 at 12:29:07PM +0800, Coly Li wrote:
>>>> On 2020/8/4 12:15, Ming Lei wrote:
>>>>> In case of a block device backend, if the backend supports discard,
>>>>> the loop device will set the queue flag QUEUE_FLAG_DISCARD.
>>>>>
>>>>> However, limits.discard_granularity isn't set up, which is wrong; see
>>>>> the following description in Documentation/ABI/testing/sysfs-block:
>>>>>
>>>>> A discard_granularity of 0 means that the device does not support
>>>>> discard functionality.
>>>>>
>>>>> In particular, 9b15d109a6b2 ("block: improve discard bio alignment in
>>>>> __blkdev_issue_discard()") started to use q->limits.discard_granularity
>>>>> for computing max discard sectors, and a zero discard granularity
>>>>> causes a kernel oops[1].
>>>>>
>>>>> Fix the issue by setting up discard granularity and alignment.
>>>>> [snipped]
>>>>
>>>> Hi Ming,
>>>>
>>>> I did a similar change; it can avoid the panic and the 0-length
>>>> discard bio. But yesterday I realized the discard request to the loop
>>>> device should not go into __blkdev_issue_discard(). As commit
>>>> c52abf563049 ("loop: Better discard support for block devices")
>>>> mentioned, it should go into blkdev_issue_zeroout(); this is why
>>>> loop_config_discard() sets max_discard_sectors to
>>>> backingq->limits.max_write_zeroes_sectors.
>>>
>>> That commit meant REQ_OP_DISCARD on a loop device is translated into
>>> blkdev_issue_zeroout(), because REQ_OP_DISCARD is handled as
>>> file->f_op->fallocate(FALLOC_FL_PUNCH_HOLE), which will cause
>>> "subsequent reads from this range will return zeroes".
>>>
>>>> Now I am looking at the problem of why a discard request on the loop
>>>> device doesn't go into blkdev_issue_zeroout().
>>>
>>> No, that is correct behavior, since loop can support discard or zeroout.
>>>
>>> If QUEUE_FLAG_DISCARD is set, neither discard_granularity nor max
>>> discard sectors should be zero.
>>>
>>> This patch shouldn't set discard_granularity if
>>> limits.max_write_zeroes_sectors is zero; I will fix it in V2.
>>>
>>>> With the above change, discard is very slow on a loop device with a
>>>> block device backend. In my testing, mkfs.xfs on /dev/loop0 did not
>>>> complete in 20 minutes, and each discard request was only 4097 sectors.
>>>
>>> I'd suggest you check the discard related queue limits, and see why
>>> each discard request sends only 4097 sectors.
>>>
>>> Or do we need to mirror the underlying queue's discard_granularity too?
>>> Can you try the following patch?
>>
>> Can I know the exact command line to reproduce the panic?
>
> modprobe scsi_debug delay=0 dev_size_mb=2048 max_queue=1
>
> losetup -f /dev/sdc --direct-io=on  # suppose /dev/sdc is the scsi_debug LUN
>
> mkfs.ext4 /dev/loop0  # suppose loop0 is the loop device backed by /dev/sdc
>
> mount /dev/loop0 /mnt
>
> cd /mnt
> dbench -t 20 -s 64

Thanks for the information. With the above command lines, I also find a
special case where a 0-byte request triggers a similar BUG() panic ---
when the discard LBA is 0 and the loop device driver's
queue->limits.discard_granularity is 0.

>> I tried blkdiscard, and dbench -D /mount_point_to_loop0, and saw all
>> the discard requests go into blkdev_fallocate()=>blkdev_issue_zeroout().
>> No request went into __blkdev_issue_discard().
>
> The issue isn't related to the backend queue; it is triggered on the
> loop device's own request queue.

Yes, it is clear to me now.
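For my own understanding, I expect the V2 change to loop_config_discard()
to look roughly like the untested sketch below. I dropped the
lo_encrypt_key_size handling for brevity, and the fallback to the backing
queue's physical block size is my assumption, not something you stated:

	struct file *file = lo->lo_backing_file;
	struct inode *inode = file->f_mapping->host;
	struct request_queue *q = lo->lo_queue;
	u32 granularity = 0, max_discard_sectors = 0;

	if (S_ISBLK(inode->i_mode)) {
		struct request_queue *backingq = bdev_get_queue(inode->i_bdev);

		/*
		 * Discard on a block device backed loop is translated to
		 * zeroout, so advertise the backing queue's write_zeroes
		 * limit, and mirror its discard_granularity, falling back
		 * to its physical block size when it reports 0.
		 */
		max_discard_sectors = backingq->limits.max_write_zeroes_sectors;
		granularity = backingq->limits.discard_granularity ?:
			queue_physical_block_size(backingq);
	} else if (file->f_op->fallocate) {
		/* file backed loop: punch hole, as before */
		max_discard_sectors = UINT_MAX >> 9;
		granularity = inode->i_sb->s_blocksize;
	}

	if (max_discard_sectors) {
		q->limits.discard_granularity = granularity;
		blk_queue_max_discard_sectors(q, max_discard_sectors);
		blk_queue_max_write_zeroes_sectors(q, max_discard_sectors);
		blk_queue_flag_set(QUEUE_FLAG_DISCARD, q);
	} else {
		q->limits.discard_granularity = 0;
		blk_queue_max_discard_sectors(q, 0);
		blk_queue_max_write_zeroes_sectors(q, 0);
		blk_queue_flag_clear(QUEUE_FLAG_DISCARD, q);
	}
	q->limits.discard_alignment = 0;

This way QUEUE_FLAG_DISCARD is only set together with a non-zero
discard_granularity, which is what the sysfs-block ABI document requires.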
Besides the loop device driver, __blkdev_issue_discard() should also be
fixed to tolerate a buggy queue->limits.discard_granularity of 0, in case
some other driver makes a similar mistake, or an unlucky downstream kernel
maintainer misses your loop device driver fix. I have a defensive check in
mind; see the sketch below.
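Something like this (completely untested; the WARN_ON_ONCE and the
-EOPNOTSUPP return value are just my first idea of the right policy):

	/*
	 * In __blkdev_issue_discard(), right after the existing
	 * blk_queue_discard(q) check: a driver that sets
	 * QUEUE_FLAG_DISCARD must also set a non-zero
	 * discard_granularity, so catch violations here instead of
	 * crashing later in the granularity based alignment math.
	 */
	if (WARN_ON_ONCE(q->limits.discard_granularity == 0))
		return -EOPNOTSUPP;

Yes, please post the v2 patch; now I can help to review it. Thanks for
your hint :-)

Coly Li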