Re: [PATCH 2/2] blkdev: __blkdev_direct_IO_simple: make sure to fill up the bio

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jul 19, 2018 at 11:39:18AM +0200, Martin Wilck wrote:
> bio_iov_iter_get_pages() returns only pages for a single non-empty
> segment of the input iov_iter's iovec. This may be much less than the number
> of pages __blkdev_direct_IO_simple() is supposed to process. Call
> bio_iov_iter_get_pages() repeatedly until either the requested number
> of bytes is reached, or bio.bi_io_vec is exhausted. If this is not done,
> short writes or reads may occur for direct synchronous IOs with multiple
> iovec slots (such as generated by writev()). In that case,
> __generic_file_write_iter() falls back to buffered writes, which
> has been observed to cause data corruption in certain workloads.
> 
> Note: if segments aren't page-aligned in the input iovec, this patch may
> result in multiple adjacent slots of the bi_io_vec array to reference the same
> page (the byte ranges are guaranteed to be disjunct if the preceding patch is
> applied). We haven't seen problems with that in our and the customer's
> tests. It'd be possible to detect this situation and merge bi_io_vec slots
> that refer to the same page, but I prefer to keep it simple for now.
> 
> Fixes: 72ecad22d9f1 ("block: support a full bio worth of IO for simplified bdev direct-io")
> Signed-off-by: Martin Wilck <mwilck@xxxxxxxx>
> ---
>  fs/block_dev.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/block_dev.c b/fs/block_dev.c
> index 0dd87aa..41643c4 100644
> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -221,7 +221,12 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
>  
>  	ret = bio_iov_iter_get_pages(&bio, iter);
>  	if (unlikely(ret))
> -		return ret;
> +		goto out;
> +
> +	while (ret == 0 &&
> +	       bio.bi_vcnt < bio.bi_max_vecs && iov_iter_count(iter) > 0)
> +		ret = bio_iov_iter_get_pages(&bio, iter);
> +
>  	ret = bio.bi_iter.bi_size;
>  
>  	if (iov_iter_rw(iter) == READ) {
> @@ -250,6 +255,7 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
>  		put_page(bvec->bv_page);
>  	}
>  
> +out:
>  	if (vecs != inline_vecs)
>  		kfree(vecs);
>

You might put the 'vecs' leak fix into another patch, and resue the
current code block for that.

Looks all users of bio_iov_iter_get_pages() need this kind of fix, so
what do you think about the following way?

diff --git a/block/bio.c b/block/bio.c
index f3536bfc8298..23dd4c163dfc 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -904,15 +904,7 @@ int bio_add_page(struct bio *bio, struct page *page,
 }
 EXPORT_SYMBOL(bio_add_page);
 
-/**
- * bio_iov_iter_get_pages - pin user or kernel pages and add them to a bio
- * @bio: bio to add pages to
- * @iter: iov iterator describing the region to be mapped
- *
- * Pins as many pages from *iter and appends them to @bio's bvec array. The
- * pages will have to be released using put_page() when done.
- */
-int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
+static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 {
 	unsigned short nr_pages = bio->bi_max_vecs - bio->bi_vcnt;
 	struct bio_vec *bv = bio->bi_io_vec + bio->bi_vcnt;
@@ -951,6 +943,28 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 	iov_iter_advance(iter, size);
 	return 0;
 }
+
+/**
+ * bio_iov_iter_get_pages - pin user or kernel pages and add them to a bio
+ * @bio: bio to add pages to
+ * @iter: iov iterator describing the region to be mapped
+ *
+ * Pins as many pages from *iter and appends them to @bio's bvec array. The
+ * pages will have to be released using put_page() when done.
+ */
+int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
+{
+	int ret;
+	unsigned int size;
+
+	do {
+		size = bio->bi_iter.bi_size;
+		ret = __bio_iov_iter_get_pages(bio, iter);
+	} while (!bio_full(bio) && iov_iter_count(iter) > 0 &&
+			bio->bi_iter.bi_size > size);
+
+	return bio->bi_iter.bi_size > 0 ? 0 : ret;
+}
 EXPORT_SYMBOL_GPL(bio_iov_iter_get_pages);
 
 static void submit_bio_wait_endio(struct bio *bio)

Thanks, 
Ming



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux