RE: regression introduced by "block: Add support for DAX reads/writes to block devices"

I think I see the problem.  I'm kind of wrapped up in other things right now; can you try replacing this line in dax_io()?

-				bh->b_size = PAGE_ALIGN(end - pos);
+				bh->b_size = ALIGN(end - pos, 1 << blkbits);
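For illustration only, here is a userspace sketch of what the two lines compute for a 512-byte read at the end of the device. It is not kernel code: align_up(), map_size_old(), map_size_new() and SKETCH_PAGE_SIZE are hypothetical names, and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages; the kernel's PAGE_SIZE depends on the architecture. */
#define SKETCH_PAGE_SIZE 4096ULL

/* Userspace stand-in for the kernel's ALIGN(): round x up to a multiple of
 * a, where a must be a power of two. */
static uint64_t align_up(uint64_t x, uint64_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/* Size requested by the old line: PAGE_ALIGN(end - pos). */
static uint64_t map_size_old(uint64_t pos, uint64_t end)
{
	return align_up(end - pos, SKETCH_PAGE_SIZE);
}

/* Size requested by the proposed line: ALIGN(end - pos, 1 << blkbits). */
static uint64_t map_size_new(uint64_t pos, uint64_t end, unsigned int blkbits)
{
	return align_up(end - pos, 1ULL << blkbits);
}
```

With blkbits = 9 (the 512-byte block size mkfs.xfs sets) and end - pos = 512, the old line asks for 4096 bytes while the proposed line asks for 512, which no longer runs past the end of the device.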

-----Original Message-----
From: Jeff Moyer [mailto:jmoyer@xxxxxxxxxx] 
Sent: Wednesday, August 05, 2015 1:19 PM
To: Wilcox, Matthew R; linda.knippers@xxxxxx
Cc: linux-kernel@xxxxxxxxxxxxxxx; linux-fsdevel@xxxxxxxxxxxxxxx
Subject: regression introduced by "block: Add support for DAX reads/writes to block devices"

Hi, Matthew,

Linda Knippers noticed that commit (bbab37ddc20b) breaks mkfs.xfs:

# mkfs -t xfs -f /dev/pmem0
meta-data=/dev/pmem0             isize=256    agcount=4, agsize=524288 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0
data     =                       bsize=4096   blocks=2097152, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
mkfs.xfs: read failed: Numerical result out of range

I sat down with Linda to look into it, and the problem is that mkfs.xfs
sets the blocksize of the device to 512 (via BLKBSZSET), and then reads
from the last sector of the device.  This results in dax_io trying to do
a page-sized I/O at 512 bytes from the end of the device.
bdev_direct_access, receiving this bogus pos/size combo, returns
-ERANGE:

	if ((sector + DIV_ROUND_UP(size, 512)) >
					part_nr_sects_read(bdev->bd_part))
		return -ERANGE;
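As a rough userspace model of that check (not the kernel function itself), the sketch below shows why a page-sized request at the last sector fails. range_check() is a hypothetical name, and nr_sects stands in for part_nr_sects_read(bdev->bd_part).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Model of the quoted check: reject a request whose last 512-byte sector
 * would lie beyond the end of the partition. */
static int range_check(uint64_t sector, uint64_t size, uint64_t nr_sects)
{
	assert(size > 0);
	uint64_t sectors = (size + 511) / 512;	/* DIV_ROUND_UP(size, 512) */

	if (sector + sectors > nr_sects)
		return -ERANGE;
	return 0;
}
```

Assuming the device is exactly the 2097152 4096-byte blocks reported above, it has 16777216 sectors. A 4096-byte request at sector 16777215 needs 8 sectors and so returns -ERANGE, which strerror() reports as "Numerical result out of range", matching the mkfs.xfs message. A 512-byte request at the same sector needs only 1 sector and passes.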

Given that file systems supporting dax refuse to mount with a blocksize
!= page size, I'm guessing this is sort of expected behavior.  However,
we really shouldn't be breaking direct I/O on pmem devices.

So, what do you want to do?  We could make the pmem device's logical
block size fixed at the system page size.  Or, we could modify the dax
code to work with blocksize < pagesize.  Or, we could continue using the
direct I/O codepath for direct block device access.  What do you think?

Thanks,
Jeff and Linda