Re: [PATCH v4 3/4] block: add bio_iov_iter_get_all_pages() helper

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jul 20, 2018 at 06:54:48PM +0200, Martin Wilck wrote:
> On Sat, 2018-07-21 at 00:16 +0800, Ming Lei wrote:
> > On Fri, Jul 20, 2018 at 03:05:51PM +0200, Martin Wilck wrote:
> > > bio_iov_iter_get_pages() only adds pages for the next non-zero
> > > segment from the iov_iter to the bio. Some callers prefer to
> > > obtain as many pages as would fit into the bio, with proper
> > > rollback in case of failure. Add bio_iov_iter_get_all_pages()
> > > for this purpose.
> > > 
> > > Signed-off-by: Martin Wilck <mwilck@xxxxxxxx>
> > > ---
> > >  block/bio.c         | 43
> > > ++++++++++++++++++++++++++++++++++++++++++-
> > >  include/linux/bio.h |  1 +
> > >  2 files changed, 43 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/block/bio.c b/block/bio.c
> > > index 489a430..693eb3b 100644
> > > --- a/block/bio.c
> > > +++ b/block/bio.c
> > > @@ -907,8 +907,10 @@ EXPORT_SYMBOL(bio_add_page);
> > >   * @bio: bio to add pages to
> > >   * @iter: iov iterator describing the region to be mapped
> > >   *
> > > - * Pins as many pages from *iter and appends them to @bio's bvec
> > > array. The
> > > + * Pins pages from *iter and appends them to @bio's bvec array.
> > > The
> > >   * pages will have to be released using put_page() when done.
> > > + * For multi-segment *iter, this function only adds pages from the
> > > + * the next non-empty segment of the iov iterator.
> > >   */
> > >  int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
> > >  {
> > > @@ -949,6 +951,45 @@ int bio_iov_iter_get_pages(struct bio *bio,
> > > struct iov_iter *iter)
> > >  }
> > >  EXPORT_SYMBOL_GPL(bio_iov_iter_get_pages);
> > >  
> > > +/**
> > > + * bio_iov_iter_get_all_pages - pin user or kernel pages and add
> > > them to a bio
> > > + * @bio: bio to add pages to
> > > + * @iter: iov iterator describing the region to be mapped
> > > + *
> > > + * Pins pages from *iter and appends them to @bio's bvec array.
> > > The
> > > + * pages will have to be released using put_page() when done.
> > > + * This function adds as many pages as possible to a bio.
> > > + * If this function encounters an error, it unpins the pages it
> > > has
> > > + * pinned before, leaving previously pinned pages untouched.
> > > + */
> > > +int bio_iov_iter_get_all_pages(struct bio *bio, struct iov_iter
> > > *iter)
> > > +{
> > > +	unsigned short orig_vcnt = bio->bi_vcnt;
> > > +
> > > +	do {
> > > +		int ret = bio_iov_iter_get_pages(bio, iter);
> > > +
> > > +		if (unlikely(ret)) {
> > > +			struct bio_vec *bvec;
> > > +			unsigned short i;
> > > +
> > > +			bio_for_each_segment_all(bvec, bio, i) {
> > > +				if (i >= orig_vcnt) {
> > > +					put_page(bvec->bv_page);
> > > +					bvec->bv_page = NULL;
> > > +					bvec->bv_len = 0;
> > > +					bvec->bv_offset = 0;
> > > +				}
> > > +			}
> > > +			bio->bi_vcnt = orig_vcnt;
> > > +			return ret;
> > > +		}
> > > +	} while (iov_iter_count(iter) && !bio_full(bio));
> > 
> > The failure handling part(release pages) may be moved out of this
> > helper, so usage of this helper can be aligned with
> > bio_iov_iter_get_pages().
> 
> I wrote the failure handling precisely for being compatible with
> bio_iov_iter_get_pages(), which requires no rollback if it returns an
> error. If we don't do this, we have to add extra error handling code to
> every caller. It was the issue you raised with my v3 submission...
> apparently I misunderstood you.
> 
> What should happen if the new handler encounters an error from
> bio_iov_iter_get_pages() in the 2nd or later iteration?
> 
>  1 return success, so that the caller doesn't realize there was a
> problem,
>  2 return error and roll back bio changes, as I implemented it here,
>  3 return error and keep the already allocated pages.
> 
> You seem to support option 3. But that leaves it to the caller to 
> differentiate this from a failure with zero allocated pages, and clean
> up appropriately. I not sure if that's wise, and it's for sure
> different from bio_iov_iter_get_pages()' behavior. Or what am I
> missing?
> > BTW, I agree with Christoph, we may just fix/improve
> > bio_iov_iter_get_pages()
> > for all users.
> 
> Ok, thanks for confirming.

OK, if you follow this suggestion, the pinned pages may be released
in bio_iov_iter_get_pages() like what this patch does, but the bio_endio(bio)
in failure handler of __blkdev_direct_IO() has to be handled in the
following way for avoiding double release:

1) if this bio is allocated from &blkdev_dio_pool, the bio_endio() need
to be removed

2) otherwise, the bio_endio() need to be replaced with bio_put().

Frankly speaking, after your patch is in, seems it is fine to allocate
single bio for doing the dio in __blkdev_direct_IO(), given the passed
'max_pages' is <= BIO_MAX_PAGES. Then __blkdev_direct_IO() can be
simplified much.

But, both current __blkdev_direct_IO() and iomap_dio_actor() supports
short dio, do you think there is same issue in the two with yours? Or
do we need to support it in the new bio_iov_iter_get_pages()?


Thanks,
Ming



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux