Re: [PATCHv2 3/3] block: relax direct io memory alignment

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, May 18, 2022 at 07:59:36PM -0600, Keith Busch wrote:
> On Wed, May 18, 2022 at 06:53:11PM -0700, Eric Biggers wrote:
> > On Wed, May 18, 2022 at 07:00:39PM -0600, Keith Busch wrote:
> > > On Wed, May 18, 2022 at 05:14:49PM -0700, Eric Biggers wrote:
> > > > On Wed, May 18, 2022 at 10:11:31AM -0700, Keith Busch wrote:
> > > > > diff --git a/block/fops.c b/block/fops.c
> > > > > index b9b83030e0df..d8537c29602f 100644
> > > > > --- a/block/fops.c
> > > > > +++ b/block/fops.c
> > > > > @@ -54,8 +54,9 @@ static ssize_t __blkdev_direct_IO_simple(struct kiocb *iocb,
> > > > >  	struct bio bio;
> > > > >  	ssize_t ret;
> > > > >  
> > > > > -	if ((pos | iov_iter_alignment(iter)) &
> > > > > -	    (bdev_logical_block_size(bdev) - 1))
> > > > > +	if ((pos | iov_iter_count(iter)) & (bdev_logical_block_size(bdev) - 1))
> > > > > +		return -EINVAL;
> > > > > +	if (iov_iter_alignment(iter) & bdev_dma_alignment(bdev))
> > > > >  		return -EINVAL;
> > > > 
> > > > The block layer makes a lot of assumptions that bios can be split at any bvec
> > > > boundary.  With this patch, bios whose length isn't a multiple of the logical
> > > > block size can be generated by splitting, which isn't valid.
> > > 
> > > How? This patch ensures every segment is block size aligned.
> > 
> > No, it doesn't.  It ensures that the *total* length of each bio is logical block
> > size aligned.  It doesn't ensure that for the individual bvecs.  By decreasing
> > the required memory alignment to below the logical block size, you're allowing
> > logical blocks to span a page boundary.  Whenever the two pages involved aren't
> > physically contiguous, the data of the block will be split across two bvecs.
> 
> I'm aware that spanning pages can cause bad splits on the bi_max_vecs
> condition, but I believe it's well handled here. Unless I'm terribly confused,
> which is certainly possible, I think you may have missed this part of the
> patch:
> 
> @@ -1223,6 +1224,8 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
>  	pages += entries_left * (PAGE_PTRS_PER_BVEC - 1);
> 
>  	size = iov_iter_get_pages(iter, pages, LONG_MAX, nr_pages, &offset);
> +	if (size > 0)
> +		size = ALIGN_DOWN(size, queue_logical_block_size(q));
>  	if (unlikely(size <= 0))
>  		return size ? size : -EFAULT;
> 

That makes the total length of each "batch" of pages be a multiple of the
logical block size, but individual logical blocks within that batch can still be
divided into multiple bvecs in the loop just below it:

	for (left = size, i = 0; left > 0; left -= len, i++) {
		struct page *page = pages[i];

		len = min_t(size_t, PAGE_SIZE - offset, left);

		if (__bio_try_merge_page(bio, page, len, offset, &same_page)) {
			if (same_page)
				put_page(page);
		} else {
			if (WARN_ON_ONCE(bio_full(bio, len))) {
				bio_put_pages(pages + i, left, offset);
				return -EINVAL;
			}
			__bio_add_page(bio, page, len, offset);
		}
		offset = 0;
	}

- Eric



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux