Re: [PATCH 6/6] xfstests: Add mkfs input validation tests

Jan Tulak <jtulak@xxxxxxxxxx> · Mon, 18 Jul 2016 14:33:29 +0200

On Mon, Jul 18, 2016 at 1:54 PM, Jan Tulak <jtulak@xxxxxxxxxx> wrote:
> On Mon, Jul 18, 2016 at 1:47 PM, Eryu Guan <eguan@xxxxxxxxxx> wrote:
>> On Mon, Jul 18, 2016 at 01:29:47PM +0200, Jan Tulak wrote:
>>> On Mon, Jul 18, 2016 at 1:30 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>>> > On Sat, Jul 16, 2016 at 05:33:58PM +0800, Eryu Guan wrote:
>>> >> On Thu, Jul 14, 2016 at 02:43:34PM +0200, Jan Tulak wrote:
>>> >> > +do_mkfs_fail -l lazy-count=1garbage $SCRATCH_DEV
>>> >> > +do_mkfs_fail -l lazy-count=2 $SCRATCH_DEV
>>> >> > +do_mkfs_fail -l lazy-count=0 -m crc=1 $SCRATCH_DEV
>>> >> > +do_mkfs_fail -l version=1 -m crc=1 $SCRATCH_DEV
>>> >>
>>> >> This test fails in my DAX testing, where SCRATCH_DEV is ramdisk. The
>>> >> mkfs itself should fail, but it passed. Log version 2 was used
>>> >> automatically, instead of prompting "V2 logs always enabled for CRC
>>> >> enabled filesytems"
>>> >>
>>> >> [root@dhcp-66-86-11 xfstests]# mkfs -t xfs -f -l version=1 -m crc=1 /dev/ram0
>>> >> meta-data=/dev/ram0              isize=512    agcount=1, agsize=4096 blks
>>> >>          =                       sectsz=4096  attr=2, projid32bit=1
>>> >>          =                       crc=1        finobt=1, sparse=0
>>> >> data     =                       bsize=4096   blocks=4096, imaxpct=25
>>> >>          =                       sunit=0      swidth=0 blks
>>> >> naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
>>> >> log      =internal log           bsize=4096   blocks=1605, version=2
>>> >>          =                       sectsz=4096  sunit=1 blks, lazy-count=1
>>> >> realtime =none                   extsz=4096   blocks=0, rtextents=0
>>> >>
>>> >> Is it a mkfs.xfs bug or the test case should handle the special case?
>>> >
>>> > Looks like it might be a side effect of using a 4k sector size. v1
>>> > logs only supported 512 byte sectors, so it's entirely possible that
>>> > the sector size is silently overriding the log version
>>> > specification. Probably should be fixed in mkfs.
>>> >
>>> >
>>>
>>> I tried to duplicate this, but in my config it didn't fail - how did
>>> you create the ramdisk?
>>
>> I think you need to test on a 4k sector size disk. I use scsi_debug to
>> simulate physical 4k sector disk to reproduce this:
>>
>> [root@dhcp-66-86-11 xfsprogs-dev]# modprobe -r scsi_debug
>> [root@dhcp-66-86-11 xfsprogs-dev]# modprobe scsi_debug dev_size_mb=128 physblk_exp=3
>> [root@dhcp-66-86-11 xfsprogs-dev]# blockdev --getbsz --getpbsz --getss /dev/sdc
>> 4096
>> 4096
>> 512
>> [root@dhcp-66-86-11 xfsprogs-dev]# mkfs -t xfs -l version=1 -m crc=1 /dev/sdc
>> meta-data=/dev/sdc               isize=512    agcount=4, agsize=8192 blks
>>          =                       sectsz=4096  attr=2, projid32bit=1
>>          =                       crc=1        finobt=1, sparse=0
>> data     =                       bsize=4096   blocks=32768, imaxpct=25
>>          =                       sunit=0      swidth=0 blks
>> naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
>> log      =internal log           bsize=4096   blocks=1605, version=2
>>          =                       sectsz=4096  sunit=1 blks, lazy-count=1
>> realtime =none                   extsz=4096   blocks=0, rtextents=0
>>
>> If you remove the "physblk_exp=3" at modprobe time, mkfs failed as
>> expected.
>>
>
> Ah, thanks. :-) Now I can reproduce it and see what happens.

And the culprit is in mkfs, some forty lines before the crc & log version check:

        } else if (lsectorsize > XFS_MIN_SECTORSIZE && !lsu && !lsunit) {
                lsu = blocksize;
                sb_feat.log_version = 2;
        }
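
To make the ordering problem concrete, here is a stripped-down model of
the two steps. It is only a sketch, not the real mkfs code: the struct
and function names are made up for illustration, the value 512 for
XFS_MIN_SECTORSIZE is assumed from the "v1 logs only supported 512 byte
sectors" remark above, and the earlier branches of the real "else if"
chain are left out.

```c
#include <stdbool.h>

/* Assumed value for this sketch: the 512-byte minimum sector size. */
#define XFS_MIN_SECTORSIZE 512

/* Hypothetical, reduced stand-in for the state mkfs tracks. */
struct model_opts {
	int  lsectorsize;  /* log sector size in bytes */
	int  blocksize;    /* filesystem block size in bytes */
	int  lsu;          /* log stripe unit in bytes, 0 = not given */
	int  lsunit;       /* log stripe unit in blocks, 0 = not given */
	int  log_version;  /* 1 or 2 */
	bool crc;          /* -m crc=1 */
};

/* Returns true if the options are rejected, false if mkfs goes ahead. */
bool model_current_order(struct model_opts *o)
{
	/* Step 1: the sector size fixup silently bumps the log version. */
	if (o->lsectorsize > XFS_MIN_SECTORSIZE && !o->lsu && !o->lsunit) {
		o->lsu = o->blocksize;
		o->log_version = 2;
	}

	/* Step 2: the crc & log version check only sees the bumped value. */
	if (o->crc && o->log_version < 2)
		return true;

	return false;
}
```

With lsectorsize = 4096, log_version = 1 and crc set, this returns false
and leaves log_version at 2, which matches the mkfs output Eryu posted.
With lsectorsize = 512 the fixup is skipped and the check rejects the
options, as the test expects.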

The possible solutions I can think of are:

1) Make a more complicated check.
This would change just a line or two, but most likely we would end up
testing the same thing multiple times and adding unnecessary complexity.

2) Move the crc checks to an earlier place.
The only value that the crc checks can change from its default is
finobt, and finobt is neither read nor modified between argument parsing
and the crc check. This looks like a simple and safe change, but it
will move about 60 lines. I tested moving the crc checking block right
after this:
        memset(&ft, 0, sizeof(ft));
        get_topology(&xi, &ft, force_overwrite);
And it works. I haven't run the full test suite yet, though.

3) Change the silent autoupdating of the log version. The default is 2,
and if the user explicitly asks for v1, then we should either warn or
fail entirely if it is not possible to create such a filesystem.

4) Leave mkfs alone and instead update the test to check the sector
size and expect pass or fail accordingly.
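
Options 2 and 3 can be sketched side by side on a reduced model. Again,
this is only an illustration under assumptions: none of these names
exist in mkfs, the log_version_given flag is a hypothetical way of
remembering that the user passed -l version=N, and 512 stands in for
the minimum sector size.

```c
#include <stdbool.h>

/* Assumed minimum sector size for this sketch. */
#define SKETCH_MIN_SECTORSIZE 512

/* Hypothetical, reduced stand-in for the state mkfs tracks. */
struct sketch_opts {
	int  lsectorsize;        /* log sector size in bytes */
	int  blocksize;          /* filesystem block size in bytes */
	int  lsu;                /* log stripe unit in bytes, 0 = not given */
	int  lsunit;             /* log stripe unit in blocks, 0 = not given */
	int  log_version;        /* 1 or 2 */
	bool log_version_given;  /* hypothetical: user passed -l version=N */
	bool crc;                /* -m crc=1 */
};

/* Option 2: run the crc check before the sector size fixup.
 * Returns true if the options are rejected. */
bool sketch_check_first(struct sketch_opts *o)
{
	if (o->crc && o->log_version < 2)
		return true;

	if (o->lsectorsize > SKETCH_MIN_SECTORSIZE && !o->lsu && !o->lsunit) {
		o->lsu = o->blocksize;
		o->log_version = 2;
	}
	return false;
}

/* Option 3: keep the order, but never silently upgrade an explicit v1.
 * Returns true if the options are rejected. */
bool sketch_no_silent_upgrade(struct sketch_opts *o)
{
	if (o->lsectorsize > SKETCH_MIN_SECTORSIZE && !o->lsu && !o->lsunit) {
		if (o->log_version_given && o->log_version < 2)
			return true;  /* user asked for v1, device cannot do it */
		o->lsu = o->blocksize;
		o->log_version = 2;
	}

	if (o->crc && o->log_version < 2)
		return true;

	return false;
}
```

Both variants reject "-l version=1 -m crc=1" on a 4k sector device. They
differ for "-l version=1" without crc on such a device: the option 2
sketch still upgrades to v2 silently, while the option 3 sketch refuses.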

But this issue boils down to the question "what is the correct order
of doing things?" Should we first try to autosolve what we can, and
check for remaining issues after that? Or should we check the input for
issues ASAP, even if they could be solved by updating the input to
match the physical device? Right now, it looks like a "someone wrote it
that way a long time ago" mix of both.

Your ideas, guys?

Thanks,
Jan

-- 
Jan Tulak
jtulak@xxxxxxxxxx / jan@xxxxxxxx

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs