On Fri, Jul 01, 2022 at 01:28:13PM -0600, Keith Busch wrote: > On Fri, Jul 01, 2022 at 08:08:37PM +0100, Al Viro wrote: > > On Fri, Jul 01, 2022 at 12:38:36PM -0600, Keith Busch wrote: > > > > size = iov_iter_get_pages(iter, pages, LONG_MAX, nr_pages, &offset); > > > > - if (size > 0) > > > > - size = ALIGN_DOWN(size, bdev_logical_block_size(bio->bi_bdev)); > > > > if (unlikely(size <= 0)) > > > > return size ? size : -EFAULT; > > > > + nr_pages = DIV_ROUND_UP(offset + size, PAGE_SIZE); > > > > > > > > + size = ALIGN_DOWN(size, bdev_logical_block_size(bio->bi_bdev)); > > > > > > We still need to return EFAULT if size becomes 0 because that's the only way > > > bio_iov_iter_get_pages()'s loop will break out in this condition. > > > > I really don't like these calling conventions ;-/ > > No argument here; I'm just working in the space as I found it. :) > > > What do you want to happen if we have that align-down to reduce size? > > IOW, what should be the state after that loop in the caller? > > We fill up the bio to bi_max_vecs. If there are more pages than vectors, then > the bio is submitted as-is with the partially drained iov_iter. The remainder > of the iov is left for a subsequent bio to deal with. > > The ALIGN_DOWN is required because I've replaced the artificial kernel aligment > with the underlying hardware's alignment. The hardware's alignment is usually > smaller than a block size. If the last bvec has a non-block aligned offset, > then it has to be truncated down, which could result in a 0 size when bi_vcnt > is already non-zero. If that happens, we just submit the bio as-is, and > allocate a new one to deal with the rest of the iov. Wait a sec. Looks like you are dealing with the round-down in the wrong place - it applies to the *total* you've packed into the bio, no matter how it is distributed over the segments you've gathered it from. Looks like it would be more natural to handle it after the loop in the caller, would it not? I.e. while bio is not full grab pages if got nothing break pack pages into bio if can't add a page (bio_add_hw_page() failure) drop the ones not shoved there break see how much had we packed into the sucker if not a multiple of logical block size trim the bio, dropping what needs to be dropped iov_iter_revert() if nothing's packed return -EINVAL if it was a failed bio_add_hw_page() return -EFAULT otherwise return 0