On 2021/03/24 19:09, Johannes Thumshirn wrote: > On 24/03/2021 08:13, Johannes Thumshirn wrote: >> On 24/03/2021 08:11, Christoph Hellwig wrote: >>> On Wed, Mar 24, 2021 at 07:07:27AM +0000, Johannes Thumshirn wrote: >>>>>> if (iov_iter_is_bvec(iter)) { >>>>>> - if (WARN_ON_ONCE(bio_op(bio) == REQ_OP_ZONE_APPEND)) >>>>>> - return -EINVAL; >>>>>> + if (bio_op(bio) == REQ_OP_ZONE_APPEND) { >>>>>> + struct request_queue *q = bio->bi_bdev->bd_disk->queue; >>>>>> + unsigned int max_append = >>>>>> + queue_max_zone_append_sectors(q) << 9; >>>>>> + >>>>>> + if (WARN_ON_ONCE(iter->count > max_append)) >>>>>> + return -EINVAL; >>>>>> + } >>>>> >>>>> That is not correct. bio_iov_iter_get_pages just fills the bio as far >>>>> as we can, and then returns 0 for the next call to continue. Basically >>>>> what you want here is a partial version of bio_iov_bvec_set. >>>>> >>>> >>>> Isn't that what I did? The above is checking if we have REQ_OP_ZONE_APPEND and >>>> then returns EINVAL if iter->count is bigger than queue_max_zone_append_sectors(). >>>> If the check doesn't fail, its going to call bio_iov_bvec_set(). >>> >>> And that is the problem. It should not fail, the payload is decoupled >>> from the max_append size. >>> >>> Doing the proper thing is not too hard as described above - make sure >>> the bi_iter points to only the chunk of iter passed in that fits, and >>> only advance the passed in iter by that amount. >>> >> >> Ah got it now, >> thanks >> > > Stupid question, but wouldn't it be sufficient if I did (this can still be > simplified): > > diff --git a/block/bio.c b/block/bio.c > index 26b7f721cda8..9c529b2db8fa 100644 > --- a/block/bio.c > +++ b/block/bio.c > @@ -964,6 +964,16 @@ static int bio_iov_bvec_set(struct bio *bio, struct iov_iter *iter) > return 0; > } > > +static int bio_iov_append_bvec_set(struct bio *bio, struct iov_iter *iter) > +{ > + struct request_queue *q = bio->bi_bdev->bd_disk->queue; > + unsigned int max_append = queue_max_zone_append_sectors(q) << 9; > + > + iov_iter_truncate(iter, max_append); Why truncate ? the caller of bio_iov_iter_get_pages() will see a shorter iter after this return and will not loop over to send the remaining. I think you need a special bio_iov_bvec_set() or you need to patch it so that the bio size is at most max_append instead of iter->count. Same for the segments count: the bio must be set with the number of segments up to max_append instead of iter->nr_segs. Then you can do: iov_iter_advance(iter, max_append); before returning, like bio_iov_bvec_set() does. No ? > + > + return bio_iov_bvec_set(bio, iter); > +} > + > #define PAGE_PTRS_PER_BVEC (sizeof(struct bio_vec) / sizeof(struct page *)) > > /** > @@ -1094,8 +1104,8 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter) > int ret = 0; > > if (iov_iter_is_bvec(iter)) { > - if (WARN_ON_ONCE(bio_op(bio) == REQ_OP_ZONE_APPEND)) > - return -EINVAL; > + if (bio_op(bio) == REQ_OP_ZONE_APPEND) > + return bio_iov_append_bvec_set(bio, iter); > return bio_iov_bvec_set(bio, iter); > } > > The above had a successful xfstests run and several fio --ioengine=splice runs. > -- Damien Le Moal Western Digital Research