Re: dm crypt: set bdev to clone bio

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Jun 10, 2022 / 14:56, Mike Snitzer wrote:
> On Fri, Jun 10 2022 at 12:26P -0400,
> Mike Snitzer <snitzer@xxxxxxxxxx> wrote:
> 
> > On Thu, Jun 09 2022 at  7:43P -0400,
> > Shin'ichiro Kawasaki <shinichiro.kawasaki@xxxxxxx> wrote:
> > 
> > > After the commit ca522482e3ea ("dm: pass NULL bdev to bio_alloc_clone"),
> > > bdev is no longer set to clone bio for ->map function. Instead, each DM
> > > targets shall set bdev to the clone bio by calling bio_set_dev() before
> > > issuing IO. Also the commit ensured that dm_zone_endio() is called from
> > > clone_endio() only when DM targets set bdev to the clone bio.
> > > 
> > > However, crypt_map() of dm-crypt does not call bio_set_dev() for every
> > > clone bio. Then dm_zone_endio() is not called at completion of the bios
> > > and zone locks are not properly unlocked. This triggers a hang when
> > > blktests block/004 is run for dm-crypt on zoned block devices [1]. To
> > > avoid the hang, call bio_set_dev() for every bio in crypt_map().
> > > 
> > > [1]
> > > 
> > > [ 6596.702977][T55017] run blktests block/004 at 2022-06-07 20:18:01
> > 
> > <snip>
> > 
> > Please refrain from putting stack traces in patch headers whenever
> > possible.  Really no need for this, especially given how long this one
> > is!
> > 
> > I revised the header as follows:
> > https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-5.19&id=cae0053631cd4b02bb882b53c7da20652b038049

I see. I will keep this in mind for my future patches. Thanks.

> >  
> > > Fixes: ca522482e3ea ("dm: pass NULL bdev to bio_alloc_clone")
> > > Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@xxxxxxx>
> > > ---
> > >  drivers/md/dm-crypt.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
> > > index 159c6806c19b..c68523a89428 100644
> > > --- a/drivers/md/dm-crypt.c
> > > +++ b/drivers/md/dm-crypt.c
> > > @@ -3378,6 +3378,8 @@ static int crypt_map(struct dm_target *ti, struct bio *bio)
> > >  	struct dm_crypt_io *io;
> > >  	struct crypt_config *cc = ti->private;
> > >  
> > > +	bio_set_dev(bio, cc->dev->bdev);
> > > +
> > >  	/*
> > >  	 * If bio is REQ_PREFLUSH or REQ_OP_DISCARD, just bypass crypt queues.
> > >  	 * - for REQ_PREFLUSH device-mapper core ensures that no IO is in-flight
> > > @@ -3385,7 +3387,6 @@ static int crypt_map(struct dm_target *ti, struct bio *bio)
> > >  	 */
> > >  	if (unlikely(bio->bi_opf & REQ_PREFLUSH ||
> > >  	    bio_op(bio) == REQ_OP_DISCARD)) {
> > > -		bio_set_dev(bio, cc->dev->bdev);
> > >  		if (bio_sectors(bio))
> > >  			bio->bi_iter.bi_sector = cc->start +
> > >  				dm_target_offset(ti, bio->bi_iter.bi_sector);
> > > -- 
> > > 2.36.1
> > > 
> > 
> > BUT something isn't quite adding up with why you need this change
> > given commit ca522482e3ea has this compatibility code:
> > 
> > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > index 9650ba2075b8..d62f1354ecbf 100644
> > --- a/drivers/md/dm.c
> > +++ b/drivers/md/dm.c
> > @@ -581,7 +581,9 @@ static struct dm_io *alloc_io(struct mapped_device *md, struct bio *bio)
> >         struct dm_target_io *tio;
> >         struct bio *clone;
> > 
> > -       clone = bio_alloc_clone(bio->bi_bdev, bio, GFP_NOIO, &md->io_bs);
> > +       clone = bio_alloc_clone(NULL, bio, GFP_NOIO, &md->io_bs);
> > +       /* Set default bdev, but target must bio_set_dev() before issuing IO */
> > +       clone->bi_bdev = md->disk->part0;
> > 
> >         tio = clone_to_tio(clone);
> >         tio->flags = 0;
> > 
> > The clone bio passed to crypt_map() _should_ be the same as was passed
> > before commit ca522482e3ea (It gets set to md->disk->part0 rather than
> > bio->bi_bdev but they really should point to the same top-level DM
> > bdev).
> > 
> > So why is your extra override to have dm-crypt point to its underlying
> > data device important now?
> 
> Think the issue is not that the clone->bi_bdev isn't set.  It is that
> it is merely not changed from the md->disk->part0 default.
> 
> Commit ca522482e3ea added this to clone_endio:
> 
> if (likely(bio->bi_bdev != md->disk->part0)) {
>       ...
>       if (static_branch_unlikely(&zoned_enabled) &&
> 	  unlikely(blk_queue_is_zoned(q)))
> 	      dm_zone_endio(io, bio);
> }
> 
> So dm_zone_endio() isn't called unless you force dm-crypt (or any
> other target) to remap the clone bio to a different bdev.

Yes, I have same understanding. I assumed the bio->bi_bdev != md->disk->part0
can not be removed, and suggested to dm-crypt to remap all clone bios.

> 
> It is weird that a bio is completing without having been remapped by
> dm-crypt; but there could be any number of reasons why that hasn't
> happened.
> 
> Anyway, I think a correct fix needs to be to the logic in
> clone_endio().  Could be that its best to just remove the gating
> likely() conditional.
> 
> Can you please test this patch instead of your patch?

I tried this new patch and confirmed it avoids the hang. Looks good :)

Also, I ran my test set for the patch in same manner I did for the dm-5.19
branch, and observed no failure. Good. I found that this patch is already
pulled into v5.19-rc2, and a bit surprised. It looks too late to put my
Tested-by tag. Anyway, thanks for the fix.

-- 
Shin'ichiro Kawasaki

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://listman.redhat.com/mailman/listinfo/dm-devel




[Index of Archives]     [DM Crypt]     [Fedora Desktop]     [ATA RAID]     [Fedora Marketing]     [Fedora Packaging]     [Fedora SELinux]     [Yosemite Discussion]     [KDE Users]     [Fedora Docs]

  Powered by Linux