On 8/9/2015 4:52 AM, Boaz Harrosh wrote:
> On 08/06/2015 11:34 PM, Dave Chinner wrote:
>> On Thu, Aug 06, 2015 at 10:52:47AM +0300, Boaz Harrosh wrote:
>>> On 08/06/2015 06:24 AM, Dave Chinner wrote:
>>>> On Wed, Aug 05, 2015 at 09:42:54PM -0400, Linda Knippers wrote:
>>>>> On 08/05/2015 06:01 PM, Dave Chinner wrote:
>>>>>> On Wed, Aug 05, 2015 at 04:19:08PM -0400, Jeff Moyer wrote:
>>> <>
>>>>>>>
>>>>>>> I sat down with Linda to look into it, and the problem is that mkfs.xfs
>>>>>>> sets the blocksize of the device to 512 (via BLKBSZSET), and then reads
>>>>>>> from the last sector of the device. This results in dax_io trying to do
>>>>>>> a page-sized I/O at 512 bytes from the end of the device.
>>>>>>
>>>
>>> This part I do not understand. how is mkfs.xfs reading the sector?
>>> Is it through open(/dev/pmem0,...) ? O_DIRECT?
>>
>> mkfs.xfs uses O_DIRECT. Only if open(O_DIRECT) fails or mkfs.xfs is
>> told that it is working on an image file does it fall back to
>> buffered IO. All of the XFS userspace tools work this way to prevent
>> page cache pollution issues with read-once or write-once data during
>> operation.
>>
>
> Thanks, yes makes sense. This is a bug at the DAX implementation of
> bdev. Since as you know with DAX there is no difference between
> O_DIRECT and buffered, we must support any aligned IO. I bet it
> should be something with bdev not giving 4K buffer-heads to dax.c.
>
> Or ... It might just be the infamous bug where the actual partition
> they used was not 4k aligned on its start sector. So the last sector IO
> after partition translation came out wrong. This bug then should be
> fixed by: https://lists.01.org/pipermail/linux-nvdimm/2015-July/001555.html
> by:Vishal Verma
>
> Vishal I think we should add CC: stable@xxxxxxxxxxxxxxx to your patch
> because of these fdisk bugs.

That patch does cause 'mkfs -t xfs' to work.

Before:

$ sudo mkfs -t xfs -f /dev/pmem3
meta-data=/dev/pmem3             isize=256    agcount=4, agsize=524288 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=2097152, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
mkfs.xfs: read failed: Numerical result out of range

After:

$ sudo mkfs -t xfs -f /dev/pmem3
meta-data=/dev/pmem3             isize=256    agcount=4, agsize=524288 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=2097152, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

$ cat /sys/block/pmem3/queue/logical_block_size
512
$ cat /sys/block/pmem3/queue/physical_block_size
4096
$ cat /sys/block/pmem3/queue/hw_sector_size
512
$ cat /sys/block/pmem3/queue/minimum_io_size
4096

Previously physical_block_size was 512 and minimum_io_size was 0.

What about logical_block_size and hw_sector_size still being 512?

So do we want to change pmem rather than changing DAX?

-- ljk

>
>> Cheers,
>> Dave.
>
> Thanks
> Boaz
>
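
For reference, the arithmetic behind the failing read can be sketched as below. This is only an illustration under stated assumptions: the device size is taken from the mkfs output above (2097152 blocks of 4096 bytes), the I/O is assumed to start at the last 512-byte sector and to span one 4096-byte page as Jeff described, and the variable names are made up for the example. "Numerical result out of range" is the glibc message for ERANGE, which would be consistent with a range check failing on such a request, though the sketch does not show where in the kernel that happens.

```shell
# Assumed values, taken from the mkfs output in this mail.
dev_size=$((2097152 * 4096))   # device size in bytes (blocks * bsize)
sector=512                     # block size mkfs.xfs sets via BLKBSZSET
page=4096                      # size of the I/O dax_io is said to attempt

# Offset of the last 512-byte sector on the device.
last_sector_off=$((dev_size - sector))

# A page-sized I/O starting there would end past the device.
io_end=$((last_sector_off + page))
overrun=$((io_end - dev_size))

echo "device size:        $dev_size"
echo "last sector offset: $last_sector_off"
echo "page I/O would end: $io_end"
echo "bytes past the end: $overrun"
```

With sectsz=4096, as in the "After" output, the last sector starts on a page boundary and the same page-sized I/O ends exactly at the end of the device.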