On Fri, Jun 10 2022 at 12:26P -0400, Mike Snitzer <snitzer@xxxxxxxxxx> wrote: > On Thu, Jun 09 2022 at 7:43P -0400, > Shin'ichiro Kawasaki <shinichiro.kawasaki@xxxxxxx> wrote: > > > After the commit ca522482e3ea ("dm: pass NULL bdev to bio_alloc_clone"), > > bdev is no longer set to clone bio for ->map function. Instead, each DM > > targets shall set bdev to the clone bio by calling bio_set_dev() before > > issuing IO. Also the commit ensured that dm_zone_endio() is called from > > clone_endio() only when DM targets set bdev to the clone bio. > > > > However, crypt_map() of dm-crypt does not call bio_set_dev() for every > > clone bio. Then dm_zone_endio() is not called at completion of the bios > > and zone locks are not properly unlocked. This triggers a hang when > > blktests block/004 is run for dm-crypt on zoned block devices [1]. To > > avoid the hang, call bio_set_dev() for every bio in crypt_map(). > > > > [1] > > > > [ 6596.702977][T55017] run blktests block/004 at 2022-06-07 20:18:01 > > <snip> > > Please refrain from putting stack traces in patch headers whenever > possible. Really no need for this, especially given how long this one > is! > > I revised the header as follows: > https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-5.19&id=cae0053631cd4b02bb882b53c7da20652b038049 > > > Fixes: ca522482e3ea ("dm: pass NULL bdev to bio_alloc_clone") > > Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@xxxxxxx> > > --- > > drivers/md/dm-crypt.c | 3 ++- > > 1 file changed, 2 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c > > index 159c6806c19b..c68523a89428 100644 > > --- a/drivers/md/dm-crypt.c > > +++ b/drivers/md/dm-crypt.c > > @@ -3378,6 +3378,8 @@ static int crypt_map(struct dm_target *ti, struct bio *bio) > > struct dm_crypt_io *io; > > struct crypt_config *cc = ti->private; > > > > + bio_set_dev(bio, cc->dev->bdev); > > + > > /* > > * If bio is REQ_PREFLUSH or REQ_OP_DISCARD, just bypass crypt queues. > > * - for REQ_PREFLUSH device-mapper core ensures that no IO is in-flight > > @@ -3385,7 +3387,6 @@ static int crypt_map(struct dm_target *ti, struct bio *bio) > > */ > > if (unlikely(bio->bi_opf & REQ_PREFLUSH || > > bio_op(bio) == REQ_OP_DISCARD)) { > > - bio_set_dev(bio, cc->dev->bdev); > > if (bio_sectors(bio)) > > bio->bi_iter.bi_sector = cc->start + > > dm_target_offset(ti, bio->bi_iter.bi_sector); > > -- > > 2.36.1 > > > > BUT something isn't quite adding up with why you need this change > given commit ca522482e3ea has this compatibility code: > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c > index 9650ba2075b8..d62f1354ecbf 100644 > --- a/drivers/md/dm.c > +++ b/drivers/md/dm.c > @@ -581,7 +581,9 @@ static struct dm_io *alloc_io(struct mapped_device *md, struct bio *bio) > struct dm_target_io *tio; > struct bio *clone; > > - clone = bio_alloc_clone(bio->bi_bdev, bio, GFP_NOIO, &md->io_bs); > + clone = bio_alloc_clone(NULL, bio, GFP_NOIO, &md->io_bs); > + /* Set default bdev, but target must bio_set_dev() before issuing IO */ > + clone->bi_bdev = md->disk->part0; > > tio = clone_to_tio(clone); > tio->flags = 0; > > The clone bio passed to crypt_map() _should_ be the same as was passed > before commit ca522482e3ea (It gets set to md->disk->part0 rather than > bio->bi_bdev but they really should point to the same top-level DM > bdev). > > So why is your extra override to have dm-crypt point to its underlying > data device important now? Think the issue is not that the clone->bi_bdev isn't set. It is that it is merely not changed from the md->disk->part0 default. Commit ca522482e3ea added this to clone_endio: if (likely(bio->bi_bdev != md->disk->part0)) { ... if (static_branch_unlikely(&zoned_enabled) && unlikely(blk_queue_is_zoned(q))) dm_zone_endio(io, bio); } So dm_zone_endio() isn't called unless you force dm-crypt (or any other target) to remap the clone bio to a different bdev. It is weird that a bio is completing without having been remapped by dm-crypt; but there could be any number of reasons why that hasn't happened. Anyway, I think a correct fix needs to be to the logic in clone_endio(). Could be that its best to just remove the gating likely() conditional. Can you please test this patch instead of your patch? Thanks, Mike diff --git a/drivers/md/dm.c b/drivers/md/dm.c index 8b21155d3c4f..d8f16183bf27 100644 --- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -1016,23 +1016,19 @@ static void clone_endio(struct bio *bio) struct dm_io *io = tio->io; struct mapped_device *md = io->md; - if (likely(bio->bi_bdev != md->disk->part0)) { - struct request_queue *q = bdev_get_queue(bio->bi_bdev); - - if (unlikely(error == BLK_STS_TARGET)) { - if (bio_op(bio) == REQ_OP_DISCARD && - !bdev_max_discard_sectors(bio->bi_bdev)) - disable_discard(md); - else if (bio_op(bio) == REQ_OP_WRITE_ZEROES && - !q->limits.max_write_zeroes_sectors) - disable_write_zeroes(md); - } - - if (static_branch_unlikely(&zoned_enabled) && - unlikely(blk_queue_is_zoned(q))) - dm_zone_endio(io, bio); + if (unlikely(error == BLK_STS_TARGET)) { + if (bio_op(bio) == REQ_OP_DISCARD && + !bdev_max_discard_sectors(bio->bi_bdev)) + disable_discard(md); + else if (bio_op(bio) == REQ_OP_WRITE_ZEROES && + !bdev_write_zeroes_sectors(bio->bi_bdev)) + disable_write_zeroes(md); } + if (static_branch_unlikely(&zoned_enabled) && + unlikely(blk_queue_is_zoned(bdev_get_queue(bio->bi_bdev)))) + dm_zone_endio(io, bio); + if (endio) { int r = endio(ti, bio, &error); switch (r) { -- dm-devel mailing list dm-devel@xxxxxxxxxx https://listman.redhat.com/mailman/listinfo/dm-devel