On Tue, Oct 24 2023 at 3:18P -0400, Mikulas Patocka <mpatocka@xxxxxxxxxx> wrote: > > > On Tue, 24 Oct 2023, Mike Snitzer wrote: > > > For the benefit of others, since this patch was posted to the old > > dm-devel list, here is the original patch proposal: > > https://patchwork.kernel.org/project/dm-devel/patch/15ca26cc-174a-d4e8-9780-d09f8e5a6ea5@xxxxxxxxxx/ > > > > On Tue, Oct 03 2023 at 11:30P -0400, > > Mikulas Patocka <mpatocka@xxxxxxxxxx> wrote: > > > > > Hi > > > > > > Here I'm sending that patch for REQ_NOWAIT for review. > > > > > > It is not tested, except for some trivial tests that involve logical > > > volume activation. > > > > At a minimum we need to test with the simple test code Jens provided > > in commit a9ce385344f9 ("dm: don't attempt to queue IO under RCU protection") > > The question is - how do we simulate the memory allocation failures? Do we > need to add some test harness that will randomly return NULL to these > allocations? Or will we use the kernel fault-injection framework? Will think about it (and do research). Maybe others have suggestions? > > Below you'll see I insulated dm_split_and_process_bio() from ever > > checking ci.nowait_failed -- prefer methods like __send_empty_flush > > that are called from dm_split_and_process_bio() return blk_status_t > > (like __process_abnormal_io and __split_and_process_bio do). > > > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c > > index d6fbbaa7600b..2a9ff269c28b 100644 > > --- a/drivers/md/dm.c > > +++ b/drivers/md/dm.c > > @@ -572,18 +572,16 @@ static void dm_end_io_acct(struct dm_io *io) > > dm_io_acct(io, true); > > } > > > > -static struct dm_io *alloc_io(struct clone_info *ci, struct mapped_device *md, struct bio *bio) > > +static struct dm_io *alloc_io(struct mapped_device *md, struct bio *bio) > > { > > struct dm_io *io; > > struct dm_target_io *tio; > > struct bio *clone; > > > > - if (unlikely(ci->is_nowait)) { > > + if (unlikely(bio->bi_opf & REQ_NOWAIT)) { > > clone = bio_alloc_clone(NULL, bio, GFP_NOWAIT, &md->mempools->io_bs); > > - if (!clone) { > > - ci->nowait_failed = true; > > - return NULL; > > - } > > + if (!clone) > > + return PTR_ERR(-EAGAIN); > > :s/PTR_ERR/ERR_PTR/ Ha, not sure how I inverted it.. but thanks. > If -EAGAIN is the only possible error code, should we return NULL instead? Yeah, thought about that. Think returning NULL is fine and if/when other reasons for failure possible we can extend as needed. > > @@ -1813,17 +1815,17 @@ static void dm_split_and_process_bio(struct mapped_device *md, > > return; > > } > > > > - init_clone_info(&ci, md, map, bio, is_abnormal); > > - if (unlikely(ci.nowait_failed)) { > > - error = BLK_STS_AGAIN; > > - goto out; > > + io = alloc_io(md, bio); > > + if (unlikely(IS_ERR(io) && ERR_PTR(io) == -EAGAIN)) { > :s/ERR_PTR/PTR_ERR/ Yes. > > + /* Unable to do anything without dm_io. */ > > + bio_wouldblock_error(bio); > > + return; > > I would use "if (unlikely(IS_ERR(io)) { > bio->bi_status = errno_to_blk_status(PTR_ERR(io)); > bio_set_flag(bio, BIO_QUIET); > bio_endio(bio); > return;", > so that all possible error codes (that could be added in the future) are > propageted. > > Or, if you change alloc_io to return NULL, you can leave > bio_wouldblock_error here. Yeah, will just return NULL and stick with bio_wouldblock_error() for now. Thanks, Mike