On Wed, Nov 09, 2016 at 04:09:24PM +0800, Eryu Guan wrote:
> On Fri, Nov 04, 2016 at 05:18:00PM -0700, Darrick J. Wong wrote:
> > Previously, our XFS fuzzing efforts were limited to using the xfs_db
> > blocktrash command to scribble garbage all over a block. This is
> > pretty easy to discover; it would be far more interesting if we could
> > fuzz individual fields looking for unhandled corner cases. Since we
> > now have an online scrub tool, use it to check for our targeted
> > corruptions prior to the usual steps of writing to the FS, taking it
> > offline, repairing, and re-checking.
> >
> > These tests use the new xfs_db 'fuzz' command to test corner case
> > handling of every field. The 'print' command tells us which fields
> > are available, and the fuzz command can write zeroes or ones to the
> > field; set the high, middle, or low bit; add or subtract numbers; or
> > randomize the field. We loop through all fields and all fuzz verbs to
> > see if we can trip up the kernel.
> >
> > Signed-off-by: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
>
> The first test gave me a kernel crash :) xfs/1300 crashed your kernel
> djwong-devel branch. I appended the console log at the end of this mail
> if you have interest to see it.
>
> And another xfs/1300 run gave me this failure message:
>
> +/mnt/testarea/scratch: Kernel lacks GETFSMAP; scrub will be less efficient. (xfs.c line 661)
> +/mnt/testarea/scratch: Kernel cannot help scrub metadata; scrub will be incomplete. (xfs.c line 661)
> +/mnt/testarea/scratch: Kernel cannot help scrub inodes; scrub will be incomplete. (xfs.c line 661)
> +/mnt/testarea/scratch: Kernel cannot help scrub extent map; scrub will be less efficient. (xfs.c line 661)
>
> Is this known issue or something should be filtered out in the test?

That's strange, the djwong-devel branch should have getfsmap & scrub in
it...

...are you running the djwong-devel kernel and xfsprogs code? The scrub
ioctl structure has shifted some over the past few months, though GETFSMAP
hasn't changed in ages.

Wait, "another xfs/1300 run" ... so after the first crash, did you go back
to a vanilla kernel without all my crazypatches? :)

> And ext4/1300 generated large .out.bad file (51M), containing something
> like:
>
> +/mnt/testarea/scratch/test/68/S_IFREG.FMT_ETREE: extent (1101381632/2469888/4096) ends past end of filesystem at 31457280. (generic.c line 272)
> +/mnt/testarea/scratch/test/68/S_IFREG.FMT_ETREE: extent (1101389824/2478080/4096) starts past end of filesystem at 31457280. (generic.c line 264)
> +/mnt/testarea/scratch/test/68/S_IFREG.FMT_ETREE: extent (1101389824/2478080/4096) ends past end of filesystem at 31457280. (generic.c line 272)
> +/mnt/testarea/scratch/test/68/S_IFREG.FMT_ETREE: extent (1101398016/2486272/4096) starts past end of filesystem at 31457280. (generic.c line 264)
> +/mnt/testarea/scratch/test/68/S_IFREG.FMT_ETREE: extent (1101398016/2486272/4096) ends past end of filesystem at 31457280. (generic.c line 272)
> +/mnt/testarea/scratch/test/68/S_IFREG.FMT_ETREE: extent (1101406208/2494464/4096) starts past end of filesystem at 31457280. (generic.c line 264)
> +/mnt/testarea/scratch/test/68/S_IFREG.FMT_ETREE: extent (1101406208/2494464/4096) ends past end of filesystem at 31457280. (generic.c line 272)
> +/mnt/testarea/scratch/test/68/S_IFREG.FMT_ETREE: extent (1101414400/2502656/4096) starts past end of filesystem at 31457280. (generic.c line 264)
> +/mnt/testarea/scratch/test/68/S_IFREG.FMT_ETREE: extent (1101414400/2502656/4096) ends past end of filesystem at 31457280. (generic.c line 272)
>
> Seems like scrub found something wrong (real problems) and became very
> noisy?

Hmm that's even stranger. I'll try to reproduce tomorrow.

> More comments inline below.
>
> > ---
> >  common/fuzzy    |   49 ++++++++++++++++++++++++++++++------
> >  common/populate |   10 ++++---
> >  common/rc       |   15 +++++++++++
> >  tests/ext4/1300 |   60 ++++++++++++++++++++++++++++++++++++++++++++
> [snip]
> >
> > diff --git a/common/fuzzy b/common/fuzzy
> > index 6af47f1..dbff744 100644
> > --- a/common/fuzzy
> > +++ b/common/fuzzy
> > @@ -85,32 +85,47 @@ _scratch_scrub() {
> >  # Filter the xfs_db print command's field debug information
> >  # into field name and type.
> >  __filter_xfs_db_print_fields() {
> > +	filter="$1"
> > +	if [ -z "${filter}" ] || [ "${filter}" = "nofilter" ]; then
> > +		filter='^'
> > +	fi
> >  	grep ' = ' | while read key equals value; do
> > -		fuzzkey="$(echo "${key}" | sed -e 's/\([a-zA-Z0-9_]*\)\[\([0-9]*\)-[0-9]*\]/\1[\2]/g')"
> > -		if [[ "${value}" == "["* ]]; then
> > +		# Filter out any keys with an array index >= 10, and
> > +		# collapse any array range ("[1-195]") to the first item.
> > +		fuzzkey="$(echo "${key}" | sed -e '/\([a-z]*\)\[\([0-9][0-9]\+\)\].*/d' -e 's/\([a-zA-Z0-9_]*\)\[\([0-9]*\)-[0-9]*\]/\1[\2]/g')"
> > +		if [ -z "${fuzzkey}" ]; then
> > +			continue
> > +		elif [[ "${value}" == "["* ]]; then
> >  			echo "${value}" | sed -e 's/^.//g' -e 's/.$//g' -e 's/,/\n/g' | while read subfield; do
> >  				echo "${fuzzkey}.${subfield}"
> >  			done
> >  		else
> >  			echo "${fuzzkey}"
> >  		fi
> > -	done
> > +	done | egrep "${filter}"
> >  }
> >
> >  # Navigate to some part of the filesystem and print the field info.
> > +# The first argument is an egrep filter for the fields
> > +# The rest of the arguments are xfs_db commands to locate the metadata.
> >  _scratch_xfs_list_metadata_fields() {
> > +	filter="$1"
> > +	shift
> >  	if [ -n "${SCRATCH_XFS_LIST_METADATA_FIELDS}" ]; then
> > -		echo "${SCRATCH_XFS_LIST_METADATA_FIELDS}" | sed -e 's/ /\n/g'
> > +		echo "${SCRATCH_XFS_LIST_METADATA_FIELDS}" | \
> > +			sed -e 's/ /\n/g' | __filter_xfs_db_print_fields "${filter}"
> >  		return;
> >  	fi
> >
> >  	(for arg in "$@"; do
> >  		echo "${arg}"
> >  	done
> > -	echo "print") | _scratch_xfs_db | __filter_xfs_db_print_fields
> > +	echo "print") | _scratch_xfs_db | __filter_xfs_db_print_fields "${filter}"
> >  }
> >
> >  # Get a metadata field
> > +# The first arg is the field name
> > +# The rest of the arguments are xfs_db commands to find the metadata.
> >  _scratch_xfs_get_metadata_field() {
> >  	key="$1"
> >  	shift
> > @@ -124,6 +139,9 @@ _scratch_xfs_get_metadata_field() {
> >  }
> >
> >  # Set a metadata field
> > +# The first arg is the field name
> > +# The second arg is the new value
> > +# The rest of the arguments are xfs_db commands to find the metadata.
> >  _scratch_xfs_set_metadata_field() {
> >  	key="$1"
> >  	value="$2"
> > @@ -136,6 +154,9 @@ _scratch_xfs_set_metadata_field() {
> >  }
> >
> >  # Fuzz a metadata field
> > +# The first arg is the field name
> > +# The second arg is the xfs_db fuzz verb
> > +# The rest of the arguments are xfs_db commands to find the metadata.
> >  _scratch_xfs_fuzz_metadata_field() {
> >  	key="$1"
> >  	value="$2"
> > @@ -263,12 +284,24 @@ _scratch_xfs_list_fuzz_verbs() {
> >  		sed -e 's/[,.]//g' -e 's/Verbs: //g' -e 's/ /\n/g'
> >  }
> >
> > -# Fuzz the fields of some piece of metadata
> > -_scratch_xfs_fuzz_fields() {
> > -	_scratch_xfs_list_metadata_fields "$@" | while read field; do
> > +# Fuzz some of the fields of some piece of metadata
> > +# The first argument is an egrep filter
> > +# The rest of the arguments are xfs_db commands to locate the metadata.
> > +_scratch_xfs_fuzz_some_fields() {
> > +	filter="$1"
> > +	shift
> > +	echo "Fields we propose to fuzz: $@"
> > +	_scratch_xfs_list_metadata_fields "${filter}" "$@"
> > +	_scratch_xfs_list_metadata_fields "${filter}" "$@" | while read field; do
> >  		_scratch_xfs_list_fuzz_verbs | while read fuzzverb; do
> >  			__scratch_xfs_fuzz_mdrestore
> >  			__scratch_xfs_fuzz_field_test "${field}" "${fuzzverb}" "$@"
> >  		done
> >  	done
> >  }
> > +
> > +# Fuzz all of the fields of some piece of metadata
> > +# All arguments are xfs_db commands to locate the metadata.
> > +_scratch_xfs_fuzz_fields() {
> > +	_scratch_xfs_fuzz_some_fields '' "$@"
> > +}
>
> I think all the fuzz update here should be folded to patch 7/9.

(I'll look at the patch fixes in a separate reply tomorrow.)

> > diff --git a/common/populate b/common/populate
> > index 15d68fc..7d103f0 100644
> > --- a/common/populate
> > +++ b/common/populate
> > @@ -180,13 +180,13 @@ _scratch_xfs_populate() {
> >  	# FMT_EXTENTS with a remote less-than-a-block value
> >  	echo "+ attr extents with a remote less-than-a-block value"
> >  	touch "${SCRATCH_MNT}/ATTR.FMT_EXTENTS_REMOTE3K"
> > -	$XFS_IO_PROG -f -c "pwrite -S 0x43 0 3k" "${SCRATCH_MNT}/attrvalfile" > /dev/null
> > +	$XFS_IO_PROG -f -c "pwrite -S 0x43 0 $((blksz - 300))" "${SCRATCH_MNT}/attrvalfile" > /dev/null
> >  	attr -q -s user.remotebtreeattrname "${SCRATCH_MNT}/ATTR.FMT_EXTENTS_REMOTE3K" < "${SCRATCH_MNT}/attrvalfile"
> >
> >  	# FMT_EXTENTS with a remote block-size value
> >  	echo "+ attr extents with a remote one-block value"
> >  	touch "${SCRATCH_MNT}/ATTR.FMT_EXTENTS_REMOTE4K"
> > -	$XFS_IO_PROG -f -c "pwrite -S 0x44 0 4k" "${SCRATCH_MNT}/attrvalfile" > /dev/null
> > +	$XFS_IO_PROG -f -c "pwrite -S 0x44 0 ${blksz}" "${SCRATCH_MNT}/attrvalfile" > /dev/null
> >  	attr -q -s user.remotebtreeattrname "${SCRATCH_MNT}/ATTR.FMT_EXTENTS_REMOTE4K" < "${SCRATCH_MNT}/attrvalfile"
> >  	rm -rf "${SCRATCH_MNT}/attrvalfile"
> >
> > @@ -482,8 +482,8 @@ _scratch_xfs_populate_check() {
> >  	__populate_check_xfs_aformat "${btree_attr}" "btree"
> >  	__populate_check_xfs_agbtree_height "bno"
> >  	__populate_check_xfs_agbtree_height "cnt"
> > -	test -n $is_rmapbt && __populate_check_xfs_agbtree_height "rmap"
> > -	test -n $is_reflink && __populate_check_xfs_agbtree_height "refcnt"
> > +	test $is_rmapbt -ne 0 && __populate_check_xfs_agbtree_height "rmap"
> > +	test $is_reflink -ne 0 && __populate_check_xfs_agbtree_height "refcnt"
> >  }
>
> And these folded to patch 1/9?
>
> >
> >  # Check data fork format of ext4 file
> > @@ -609,7 +609,7 @@ _scratch_populate_cached() {
> >  	rm -rf "$(find "${POPULATE_METADUMP}" -mtime +2 2>/dev/null)"
> >
> >  	# Throw away cached image if it doesn't match our spec.
> > -	meta_descr="FSTYP ${FSTYP} MKFS_OPTIONS ${MKFS_OPTIONS} ARGS $@"
> > +	meta_descr="FSTYP ${FSTYP} MKFS_OPTIONS $(_scratch_mkfs_options) ARGS $@"
> >  	cmp -s "${POPULATE_METADUMP_DESCR}" <(echo "${meta_descr}") || rm -rf "${POPULATE_METADUMP}"
> >
> >  	# Do we have a cached image?
>
> This to patch 6/9?
>
> Because we usually don't introduce something in patch 1 and fix them in
> patch 2, I think :)

Generally, yes. :)

> > diff --git a/common/rc b/common/rc
> > index d904582..ec1f5de 100644
> > --- a/common/rc
> > +++ b/common/rc
> > @@ -1870,6 +1870,21 @@ _require_xfs_finobt()
> >  	_scratch_unmount
> >  }
> >
> > +# Do we have a fre
> > +_require_scratch_finobt()
> > +{
> > +	_require_scratch
> > +
> > +	if [ $FSTYP != "xfs" ]; then
> > +		_notrun "finobt not supported by scratch filesystem type: $FSTYP"
> > +		return
> > +	fi
> > +	_scratch_mkfs > /dev/null
> > +	_scratch_mount
> > +	xfs_info $SCRATCH_MNT | grep -q 'finobt=1' || _notrun "finobt not supported by scratch filesystem type: $FSTYP"
> > +	_scratch_unmount
> > +}
> > +
> >  # this test requires xfs sysfs attribute support
> >  #
> >  _require_xfs_sysfs()
> > diff --git a/tests/ext4/1300 b/tests/ext4/1300
> > new file mode 100755
> > index 0000000..3f8135e
> > --- /dev/null
> > +++ b/tests/ext4/1300
>
> [all the tests look fine to me, snip]
>
> > --- a/tests/xfs/group
> > +++ b/tests/xfs/group
> > @@ -333,3 +333,34 @@
> >  345 auto quick clone
> >  346 auto quick clone
> >  347 auto quick clone
> > +1300 dangerous_fuzzers scrub
>
> ext4/1300 is "auto quick scrub", I think xfs/1300 should be in auto
> group too?
>
> Thanks,
> Eryu
>
> P.S. console log of xfs/1300 crash
>
> [165877.766244] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
> [165877.774197] IP: [<ffffffffa0680c13>] xfs_scrub_get_inode+0xc3/0x1c0 [xfs]
> [165877.781162] PGD 179c1b067 [165877.783784] PUD 14d994067

Ohhh... I suspect this happens when xfs_scrub_op_ok tries to use sc->tp
after some error happens, which we can't do because this function is used
in the process of initializing sc.

Gonna go cry in my beer for a day or two or something,

--D

> PMD 0 [165877.787130]
> [165877.788722] Oops: 0000 [#1] SMP
> [165877.791951] Modules linked in: dm_delay dm_zero btrfs xor raid6_pq dm_thin_pool dm_persistent_data dm_bio_prison dm_snapshot dm_bufio loop dm_flakey xfs libcrc32c binfmt_misc ip6t_rpfilter ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack nf_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_mangle ip6table_security ip6table_raw iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter coretemp kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw iTCO_wdt gf128mul glue_helper ipmi_devintf cdc_ether iTCO_vendor_support ablk_helper cryptd usbnet i2c_i801 lpc_ich mii pcspkr i2c_smbus sg i7core_edac mfd_core ipmi_si edac_core ipmi_msghandler shpchp ioatdma dca acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 jbd2 mbcache sr_mod sd_mod cdrom mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ata_generic pata_acpi drm ata_piix libata crc32c_intel megaraid_sas serio_raw i2c_core bnx2 dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_debug]
> [165877.898708] CPU: 5 PID: 26242 Comm: xfs_scrub Tainted: G W 4.9.0-rc3.djwong+ #16
> [165877.907308] Hardware name: IBM System x3550 M3 -[7944OEJ]-/90Y4784 , BIOS -[D6E150CUS-1.11]- 02/08/2011
> [165877.917124] task: ffff88017a286a40 task.stack: ffffc9000cca0000
> [165877.923126] RIP: 0010:[<ffffffffa0680c13>] [<ffffffffa0680c13>] xfs_scrub_get_inode+0xc3/0x1c0 [xfs]
> [165877.932503] RSP: 0018:ffffc9000cca3ad0 EFLAGS: 00010246
> [165877.937898] RAX: 0000000000000000 RBX: fffffffffffffffe RCX: 0000000000000017
> [165877.945111] RDX: 0000000000000000 RSI: ffffc9000cca3ce8 RDI: 0000000000001123
> [165877.952328] RBP: ffffc9000cca3b20 R08: 0000000000000003 R09: 0000000000000014
> [165877.959544] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880278f59680
> [165877.966759] R13: ffffc9000cca3ba8 R14: ffffc9000cca3ce8 R15: 0000000000000000
> [165877.973976] FS: 00007f3d099f9700(0000) GS:ffff88017bb00000(0000) knlGS:0000000000000000
> [165877.982143] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [165877.987969] CR2: 0000000000000038 CR3: 000000016ad2d000 CR4: 00000000000006e0
> [165877.995185] Stack:
> [165877.997286] ffff880278f59680 0000000000000000 ffff880220cc5c00 ffff880265190780
> [165878.004840] ffff880101a67800 ffffc9000cca3ba8 ffff880278f59680 ffff88025a9e6000
> [165878.012396] ffffc9000cca3ce8 0000000000000000 ffffc9000cca3b58 ffffffffa0681a61
> [165878.019950] Call Trace:
> [165878.022551] [<ffffffffa0681a61>] __xfs_scrub_setup_inode.isra.63+0x81/0x280 [xfs]
> [165878.030236] [<ffffffffa0681c70>] xfs_scrub_setup_inode+0x10/0x20 [xfs]
> [165878.036963] [<ffffffffa068f57f>] xfs_scrub_metadata+0x2ff/0x450 [xfs]
> [165878.043607] [<ffffffffa066aa1d>] xfs_ioc_scrub_metadata+0x4d/0x80 [xfs]
> [165878.050424] [<ffffffffa066d029>] xfs_file_ioctl+0x9c9/0xb10 [xfs]
> [165878.056689] [<ffffffff8110692f>] ? get_futex_key+0x1df/0x360
> [165878.062516] [<ffffffff81106b31>] ? futex_wake+0x81/0x150
> [165878.068003] [<ffffffff812189c6>] do_vfs_ioctl+0x96/0x5b0
> [165878.073482] [<ffffffff81218f59>] SyS_ioctl+0x79/0x90
> [165878.078621] [<ffffffff81003997>] do_syscall_64+0x67/0x180
> [165878.084191] [<ffffffff816a6c2b>] entry_SYSCALL64_slow_path+0x25/0x25
> [165878.090714] Code: 8b 14 24 49 8b 75 00 85 c0 41 89 c7 44 0f b6 82 93 00 00 00 44 0f b6 8a 94 00 00 00 0f b6 8a 53 02 00 00 49 8b 55 08 48 8b 7e 08 <4c> 8b 62 38 75 38 48 8b 7d d0 8b 46 10 39 87 98 03 00 00 74 21
> [165878.110930] RIP [<ffffffffa0680c13>] xfs_scrub_get_inode+0xc3/0x1c0 [xfs]
> [165878.117938] RSP <ffffc9000cca3ad0>
> [165878.121512] CR2: 0000000000000038
> [165878.129219] ---[ end trace d23e56c58f53ccb9 ]---
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
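
For anyone following the field-name filtering in the common/fuzzy hunk, here is a rough standalone sketch of what the two sed expressions do to a key. The sed expressions are copied from the quoted patch; the wrapper name filter_key and the sample keys are made up for illustration (only the "[1-195]" range comes from the patch comment), and the \+ in the first expression assumes GNU sed.

```shell
#!/bin/sh
# Sketch only: filter_key is a hypothetical helper, not part of fstests.
# The two sed expressions are taken from the quoted common/fuzzy hunk;
# \+ is a GNU sed extension, so this assumes GNU sed.
filter_key() {
	printf '%s\n' "$1" | sed \
		-e '/\([a-z]*\)\[\([0-9][0-9]\+\)\].*/d' \
		-e 's/\([a-zA-Z0-9_]*\)\[\([0-9]*\)-[0-9]*\]/\1[\2]/g'
}

# Made-up sample keys: a plain field, two array ranges, and a
# two-digit array index that the first expression deletes.
for key in 'magicnum' 'recs[1-195]' 'u3.bmx[0-2]' 'bmx[12]'; do
	printf '%s -> "%s"\n' "${key}" "$(filter_key "${key}")"
done
```

A plain key passes through untouched, a range such as "recs[1-195]" collapses to "recs[1]", and a key with a two-digit index comes back empty, which is the case the new `[ -z "${fuzzkey}" ]` branch in the patch skips with `continue`.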