On Thu, 2022-06-23 at 16:32 -0400, Eric Farman wrote: > On Thu, 2022-06-23 at 13:11 -0600, Keith Busch wrote: > > On Thu, Jun 23, 2022 at 12:51:08PM -0600, Keith Busch wrote: > > > On Thu, Jun 23, 2022 at 02:29:13PM -0400, Eric Farman wrote: > > > > On Fri, 2022-06-10 at 12:58 -0700, Keith Busch wrote: > > > > > From: Keith Busch <kbusch@xxxxxxxxxx> > > > > > > > > > > Use the address alignment requirements from the block_device > > > > > for > > > > > direct > > > > > io instead of requiring addresses be aligned to the block > > > > > size. > > > > > > > > Hi Keith, > > > > > > > > Our s390 PV guests recently started failing to boot from a > > > > -next > > > > host, > > > > and git blame brought me here. > > > > > > > > As near as I have been able to tell, we start tripping up on > > > > this > > > > code > > > > from patch 9 [1] that gets invoked with this patch: > > > > > > > > > for (k = 0; k < i->nr_segs; k++, skip = 0) { > > > > > size_t len = i->iov[k].iov_len - skip; > > > > > > > > > > if (len > size) > > > > > len = size; > > > > > if (len & len_mask) > > > > > return false; > > > > > > > > The iovec we're failing on has two segments, one with a len of > > > > x200 > > > > (and base of x...000) and another with a len of xe00 (and a > > > > base > > > > of > > > > x...200), while len_mask is of course xfff. > > > > > > > > So before I go any further on what we might have broken, do you > > > > happen > > > > to have any suggestions what might be going on here, or > > > > something > > > > I > > > > should try? > > > > > > Thanks for the notice, sorry for the trouble. This check wasn't > > > intended to > > > have any difference from the previous code with respect to the > > > vector lengths. > > > > > > Could you tell me if you're accessing this through the block > > > device > > > direct-io, > > > or through iomap filesystem? > > Reasonably certain the failure's on iomap. I'd reverted the subject > patch from next-20220622 and got things in working order. > > > If using iomap, the previous check was this: > > > > unsigned int blkbits = > > blksize_bits(bdev_logical_block_size(iomap->bdev)); > > unsigned int align = iov_iter_alignment(dio->submit.iter); > > ... > > if ((pos | length | align) & ((1 << blkbits) - 1)) > > return -EINVAL; > > > > > ... > > The result of "iov_iter_alignment()" would include "0xe00 | 0x200" > > in > > your > > example, and checked against 0xfff should have been failing prior > > to > > this > > patch. Unless I'm missing something... > > Nope, you're not. I didn't look back at what the old check was doing, > just saw "0xe00 and 0x200" and thought "oh there's one page" instead > of > noting the code was or'ing them. My bad. > > That was the last entry in my trace before the guest gave up, as > everything else through this code up to that point seemed okay. I'll > pick up the working case and see if I can get a clearer picture > between > the two. Looking over the trace again, I realize I did dump iov_iter_alignment() as a comparator, and I see one pass through that had a non-zero response but bdev_iter_is_aligned() returned true... count = x1000 iov_offset = x0 nr_segs = 1 iov_len = x1000 (len_mask = xfff) iov_base = x...200 (addr_mask = x1ff) That particular pass through is in the middle of the stuff it tried to do, so I don't know if that's the cause or not but it strikes me as unusual. Will look into that tomorrow and report back. >