Re: [PATCH 2/2] blkdev: __blkdev_direct_IO_simple: make sure to fill up the bio

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu 19-07-18 19:04:46, Ming Lei wrote:
> On Thu, Jul 19, 2018 at 11:39:18AM +0200, Martin Wilck wrote:
> > bio_iov_iter_get_pages() returns only pages for a single non-empty
> > segment of the input iov_iter's iovec. This may be much less than the number
> > of pages __blkdev_direct_IO_simple() is supposed to process. Call
> > bio_iov_iter_get_pages() repeatedly until either the requested number
> > of bytes is reached, or bio.bi_io_vec is exhausted. If this is not done,
> > short writes or reads may occur for direct synchronous IOs with multiple
> > iovec slots (such as generated by writev()). In that case,
> > __generic_file_write_iter() falls back to buffered writes, which
> > has been observed to cause data corruption in certain workloads.
> > 
> > Note: if segments aren't page-aligned in the input iovec, this patch may
> > result in multiple adjacent slots of the bi_io_vec array to reference the same
> > page (the byte ranges are guaranteed to be disjunct if the preceding patch is
> > applied). We haven't seen problems with that in our and the customer's
> > tests. It'd be possible to detect this situation and merge bi_io_vec slots
> > that refer to the same page, but I prefer to keep it simple for now.
> > 
> > Fixes: 72ecad22d9f1 ("block: support a full bio worth of IO for simplified bdev direct-io")
> > Signed-off-by: Martin Wilck <mwilck@xxxxxxxx>
> > ---
> >  fs/block_dev.c | 8 +++++++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/block_dev.c b/fs/block_dev.c
> > index 0dd87aa..41643c4 100644
> > --- a/fs/block_dev.c
> > +++ b/fs/block_dev.c
> > @@ -221,7 +221,12 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
> >  
> >  	ret = bio_iov_iter_get_pages(&bio, iter);
> >  	if (unlikely(ret))
> > -		return ret;
> > +		goto out;
> > +
> > +	while (ret == 0 &&
> > +	       bio.bi_vcnt < bio.bi_max_vecs && iov_iter_count(iter) > 0)
> > +		ret = bio_iov_iter_get_pages(&bio, iter);
> > +
> >  	ret = bio.bi_iter.bi_size;
> >  
> >  	if (iov_iter_rw(iter) == READ) {
> > @@ -250,6 +255,7 @@ __blkdev_direct_IO_simple(struct kiocb *iocb, struct iov_iter *iter,
> >  		put_page(bvec->bv_page);
> >  	}
> >  
> > +out:
> >  	if (vecs != inline_vecs)
> >  		kfree(vecs);
> >
> 
> You might put the 'vecs' leak fix into another patch, and resue the
> current code block for that.
> 
> Looks all users of bio_iov_iter_get_pages() need this kind of fix, so
> what do you think about the following way?

No. AFAICT all the other users of bio_iov_iter_get_pages() are perfectly
fine with it returning less pages and they loop appropriately.

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR



[Index of Archives]     [Linux RAID]     [Linux SCSI]     [Linux ATA RAID]     [IDE]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Device Mapper]

  Powered by Linux