On 08/06/2015 06:24 AM, Dave Chinner wrote:
> On Wed, Aug 05, 2015 at 09:42:54PM -0400, Linda Knippers wrote:
>> On 08/05/2015 06:01 PM, Dave Chinner wrote:
>>> On Wed, Aug 05, 2015 at 04:19:08PM -0400, Jeff Moyer wrote:
<>
>>>>
>>>> I sat down with Linda to look into it, and the problem is that mkfs.xfs
>>>> sets the blocksize of the device to 512 (via BLKBSZSET), and then reads
>>>> from the last sector of the device. This results in dax_io trying to do
>>>> a page-sized I/O at 512 bytes from the end of the device.
>>>

This part I do not understand. How is mkfs.xfs reading the sector?
Is it through open(/dev/pmem0,...)? With O_DIRECT?

If so, then yes, the inode of /dev/pmem0 is IS_DAX() and will try to use
the dax.c code. (I think. Which kernel?)
Which means this is a bug.

>>> Right - we have to be able to do IO to that last sector, so this is
>>> a sanity check to tell if the block dev is large enough. The XFS
>>> kernel code does the same end-of-device sector read when the
>>> filesystem is mounted, too.
>>>
>>>> bdev_direct_access, receiving this bogus pos/size combo, returns
>>>> -ERANGE:
>>>>
>>>>         if ((sector + DIV_ROUND_UP(size, 512)) >
>>>>                         part_nr_sects_read(bdev->bd_part))
>>>>                 return -ERANGE;
>>>>
>>>> Given that file systems supporting dax refuse to mount with a blocksize
>>>> != page size, I'm guessing this is sort of expected behavior. However,
>>>> we really shouldn't be breaking direct I/O on pmem devices.
>>>

No, this is a BUG. A buffered or direct read/write to an IS_DAX() inode
should work with any alignment and size, since with DAX the buffered and
direct paths are the exact same code path, and buffered I/O expects I/O
of any size to work.

This is probably a bug in the DAX handling of the bdev inode.
Let me test this. I will send a fix ASAP.

<>
>>> the output of:
>>>
>>> /sys/block/pmem0/queue/logical_block_size
>> 512
>>
>>> /sys/block/pmem0/queue/physical_block_size
>> 512
>>

There is a pending fix for this. Do you need it sent to stable?

>>> /sys/block/pmem0/queue/hw_sector_size
>> 512
>>
>>> /sys/block/pmem0/queue/minimum_io_size
>> 512
>>
>>> /sys/block/pmem0/queue/optimal_io_size
>> 0

Thanks
Boaz
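
For reference, the arithmetic behind the -ERANGE that Jeff quotes can be reproduced in user space. This is only a minimal sketch and not kernel code: range_check() and its nr_sects parameter are hypothetical stand-ins for the check in bdev_direct_access and for part_nr_sects_read(bdev->bd_part), and the device size used in the note below is an assumed example value.

```c
#include <errno.h>
#include <stdint.h>

/* Round-up division, restated here for user space. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Hypothetical stand-in for the range check quoted in the thread.
 * sector:   first 512-byte sector of the request
 * size:     request length in bytes
 * nr_sects: device (partition) size in 512-byte sectors
 * Returns 0 if the request fits inside the device, -ERANGE otherwise.
 */
int range_check(uint64_t sector, uint64_t size, uint64_t nr_sects)
{
	if (sector + DIV_ROUND_UP(size, 512) > nr_sects)
		return -ERANGE;
	return 0;
}
```

Assuming a device of 2097152 sectors, range_check(2097151, 512, 2097152) returns 0, so a 512-byte read of the last sector fits. range_check(2097151, 4096, 2097152) returns -ERANGE, because a page-sized request starting at the last sector runs 7 sectors past the end of the device. That is the pos/size combination described above.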