On Mon, Jul 18, 2016 at 1:54 PM, Jan Tulak <jtulak@xxxxxxxxxx> wrote:
> On Mon, Jul 18, 2016 at 1:47 PM, Eryu Guan <eguan@xxxxxxxxxx> wrote:
>> On Mon, Jul 18, 2016 at 01:29:47PM +0200, Jan Tulak wrote:
>>> On Mon, Jul 18, 2016 at 1:30 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>>> > On Sat, Jul 16, 2016 at 05:33:58PM +0800, Eryu Guan wrote:
>>> >> On Thu, Jul 14, 2016 at 02:43:34PM +0200, Jan Tulak wrote:
>>> >> > +do_mkfs_fail -l lazy-count=1garbage $SCRATCH_DEV
>>> >> > +do_mkfs_fail -l lazy-count=2 $SCRATCH_DEV
>>> >> > +do_mkfs_fail -l lazy-count=0 -m crc=1 $SCRATCH_DEV
>>> >> > +do_mkfs_fail -l version=1 -m crc=1 $SCRATCH_DEV
>>> >>
>>> >> This test fails in my DAX testing, where SCRATCH_DEV is ramdisk. The
>>> >> mkfs itself should fail, but it passed. Log version 2 was used
>>> >> automatically, instead of prompting "V2 logs always enabled for CRC
>>> >> enabled filesytems"
>>> >>
>>> >> [root@dhcp-66-86-11 xfstests]# mkfs -t xfs -f -l version=1 -m crc=1 /dev/ram0
>>> >> meta-data=/dev/ram0              isize=512    agcount=1, agsize=4096 blks
>>> >>          =                       sectsz=4096  attr=2, projid32bit=1
>>> >>          =                       crc=1        finobt=1, sparse=0
>>> >> data     =                       bsize=4096   blocks=4096, imaxpct=25
>>> >>          =                       sunit=0      swidth=0 blks
>>> >> naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
>>> >> log      =internal log           bsize=4096   blocks=1605, version=2
>>> >>          =                       sectsz=4096  sunit=1 blks, lazy-count=1
>>> >> realtime =none                   extsz=4096   blocks=0, rtextents=0
>>> >>
>>> >> Is it a mkfs.xfs bug or the test case should handle the special case?
>>> >
>>> > Looks like it might be a side effect of using a 4k sector size. v1
>>> > logs only supported 512 byte sectors, so it's entirely possible that
>>> > the sector size is silently overriding the log version
>>> > specification. Probably should be fixed in mkfs.
>>> >
>>> >
>>>
>>> I tried to duplicate this, but in my config it didn't fail - how did
>>> you create the ramdisk?
>>
>> I think you need to test on a 4k sector size disk. I use scsi_debug to
>> simulate physical 4k sector disk to reproduce this:
>>
>> [root@dhcp-66-86-11 xfsprogs-dev]# modprobe -r scsi_debug
>> [root@dhcp-66-86-11 xfsprogs-dev]# modprobe scsi_debug dev_size_mb=128 physblk_exp=3
>> [root@dhcp-66-86-11 xfsprogs-dev]# blockdev --getbsz --getpbsz --getss /dev/sdc
>> 4096
>> 4096
>> 512
>> [root@dhcp-66-86-11 xfsprogs-dev]# mkfs -t xfs -l version=1 -m crc=1 /dev/sdc
>> meta-data=/dev/sdc               isize=512    agcount=4, agsize=8192 blks
>>          =                       sectsz=4096  attr=2, projid32bit=1
>>          =                       crc=1        finobt=1, sparse=0
>> data     =                       bsize=4096   blocks=32768, imaxpct=25
>>          =                       sunit=0      swidth=0 blks
>> naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
>> log      =internal log           bsize=4096   blocks=1605, version=2
>>          =                       sectsz=4096  sunit=1 blks, lazy-count=1
>> realtime =none                   extsz=4096   blocks=0, rtextents=0
>>
>> If you remove the "physblk_exp=3" at modprobe time, mkfs failed as
>> expected.
>>
>
> Ah, thanks. :-) Now I can reproduce it and see what happens.

And the culprit is in mkfs, some forty lines before the crc & log
version check:

	} else if (lsectorsize > XFS_MIN_SECTORSIZE && !lsu && !lsunit) {
		lsu = blocksize;
		sb_feat.log_version = 2;
	}

When the log sector size is larger than 512 bytes, this branch sets the
log version to 2 no matter what the user asked for, so the later
"-l version=1 with -m crc=1" check never sees version 1.

The possible solutions I can think of are:

1) Make a more complicated check. This would change just a line or two,
but most likely we would test the same thing multiple times and add
unnecessary complexity.

2) Move the crc checks to an earlier place. The only value that the crc
checks can change from its default is finobt, and finobt is neither
read nor modified between argument parsing and the crc check. This
looks like a simple and safe thing, but it will move some ~60 lines.
I tested moving the crc testing block right behind this:

	memset(&ft, 0, sizeof(ft));
	get_topology(&xi, &ft, force_overwrite);

And it works. I haven't run the full test suite yet, though.

3) Change the silent auto-updating of the log version. The default is 2,
and if the user explicitly requests v1, then we should either warn or
fail entirely if it is not possible to make such a filesystem.

4) Do nothing with mkfs and instead update the test to check the sector
size and expect pass or fail accordingly...

But this issue boils down to the question "what is the correct order of
doing things?" Should we try to autosolve what we can at first, and
check for remaining issues after that? Or should we check for issues
with the input ASAP, even if they can be solved by updating the input
to match the physical device?

Right now, it looks like a "someone wrote it that way a long time ago"
mix of both.

Your ideas, guys?

Thanks,
Jan

-- 
Jan Tulak
jtulak@xxxxxxxxxx / jan@xxxxxxxx

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs
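
P.S. To make option 3 concrete, here is a minimal standalone sketch of the "check explicit input first, auto-adjust only afterwards" ordering. It is not a patch against mkfs: struct log_opts, resolve_log_version() and MIN_SECTORSIZE are hypothetical names made up for illustration, and the sector-size error text is invented. Only the CRC message is taken from the thread above.

```c
#include <stdbool.h>
#include <stdio.h>

#define MIN_SECTORSIZE 512	/* stand-in for XFS_MIN_SECTORSIZE */

/* Hypothetical, simplified view of the state involved (not mkfs's own). */
struct log_opts {
	int	log_version;	/* 1 or 2 */
	bool	version_given;	/* user passed -l version=N */
	bool	crc_enabled;	/* -m crc=1 */
	int	lsectorsize;	/* log sector size in bytes */
};

/* Returns 0 on success, -1 if the requested combination cannot be made. */
static int resolve_log_version(struct log_opts *o)
{
	/* Step 1: reject impossible explicit requests before adjusting. */
	if (o->version_given && o->log_version == 1) {
		if (o->crc_enabled) {
			/* message text as quoted earlier in this thread */
			fprintf(stderr,
		"V2 logs always enabled for CRC enabled filesytems\n");
			return -1;
		}
		if (o->lsectorsize > MIN_SECTORSIZE) {
			/* invented wording, for illustration only */
			fprintf(stderr,
		"V1 logs need %d byte sectors, device has %d\n",
				MIN_SECTORSIZE, o->lsectorsize);
			return -1;
		}
		return 0;	/* explicit v1 is feasible, keep it */
	}

	/* Step 2: nothing conflicting was requested, so auto-select v2. */
	if (o->crc_enabled || o->lsectorsize > MIN_SECTORSIZE)
		o->log_version = 2;
	return 0;
}
```

With this ordering, an explicit "-l version=1" on a 4k-sector device fails instead of being silently upgraded, while a run without "-l version" still ends up with v2 automatically. Whether failing or merely warning is the right behaviour is exactly the open question above.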