Hi Dave,

thank you for your comments. The test I am doing unmounts the XFS filesystem while its underlying block device returns intermittent IO errors. The block device in question is a custom Device-Mapper target, and it returns -ECANCELED in this case. Should I return some other errno instead?

The exact same test works fine with ext4: its unmount finishes, the system seems to continue functioning normally, and kmemleak is also happy.

When doing a simpler reproduction with the "error" Device-Mapper target, umount gets stuck and never returns, while the kernel keeps printing:

XFS (dm-0): metadata I/O error: block 0x0 ("xfs_buf_iodone_callbacks") error 5 numblks 1
XFS (dm-0): metadata I/O error: block 0x0 ("xfs_buf_iodone_callbacks") error 5 numblks 1
... (the same message repeats indefinitely)

What is the expected behavior of umount in this case, when there are IO errors during it?
The reproduction I do is:

root@vc-00-00-1075-dev:~# cat /etc/xfs.protofile
dummy       : bootfilename, not used, backward compatibility
0 0         : numbers of blocks and inodes, not used, backward compatibility
d--777 0 0  : set 777 perms for the root dir
$
$
root@vc-00-00-1075-dev:~# mkfs.xfs -f /dev/vde -p /etc/xfs.protofile
meta-data=/dev/vde               isize=256    agcount=4, agsize=1310720 blks
         =                       sectsz=512   attr=2, projid32bit=0
data     =                       bsize=4096   blocks=5242880, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
root@vc-00-00-1075-dev:~# blockdev --getsz /dev/vde
41943040
root@vc-00-00-1075-dev:~# dmsetup create VDE --table "0 41943040 linear /dev/vde 0"
root@vc-00-00-1075-dev:~# mount -o noatime,sync /dev/mapper/VDE /export/v122_nas/

Copy some files to the mountpoint, then:

dmsetup reload VDE --table "0 41943040 error"
dmsetup resume VDE
umount /export/v122_nas

This never returns, and /proc shows:

root@vc-00-00-1075-dev:~# cat /proc/2684/stack
[<ffffffffa033ac6a>] xfs_ail_push_all_sync+0x9a/0xd0 [xfs]
[<ffffffffa0330123>] xfs_unmountfs+0x63/0x160 [xfs]
[<ffffffffa02ee265>] xfs_fs_put_super+0x25/0x60 [xfs]
[<ffffffff8118fd12>] generic_shutdown_super+0x62/0xf0
[<ffffffff8118fdd0>] kill_block_super+0x30/0x80
[<ffffffff811903dc>] deactivate_locked_super+0x3c/0x90
[<ffffffff81190d7e>] deactivate_super+0x4e/0x70
[<ffffffff811ad086>] mntput_no_expire+0x106/0x160
[<ffffffff811ae760>] sys_umount+0xa0/0xe0
[<ffffffff816ab919>] system_call_fastpath+0x16/0x1b
[<ffffffffffffffff>] 0xffffffffffffffff

And after some time, a hung task warning shows:

INFO: task kworker/2:1:39 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
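For convenience, the steps above can be collected into a small script. This is only a sketch of my reproduction: DEV and SECTORS match my setup (adjust for yours), and the privileged dmsetup/mkfs/mount steps are left as comments so the table strings can be sanity-checked without root:

```shell
#!/bin/sh
# Sketch of the reproduction above. The "error" target fails every IO,
# which is what makes the subsequent umount hang in xfs_ail_push_all_sync.
DEV=/dev/vde
SECTORS=41943040                  # from: blockdev --getsz $DEV

LINEAR_TABLE="0 $SECTORS linear $DEV 0"
ERROR_TABLE="0 $SECTORS error"
echo "$LINEAR_TABLE"
echo "$ERROR_TABLE"

# mkfs.xfs -f $DEV -p /etc/xfs.protofile
# dmsetup create VDE --table "$LINEAR_TABLE"
# mount -o noatime,sync /dev/mapper/VDE /export/v122_nas/
# ... copy some files to the mountpoint ...
# dmsetup reload VDE --table "$ERROR_TABLE"
# dmsetup resume VDE
# umount /export/v122_nas         # hangs; stack as shown above
```

The two-step reload/resume is the standard way to swap a live dm table, so the filesystem sees the device start failing underneath it without being unmounted first.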
kworker/2:1     D ffffffff8180cf00     0    39      2 0x00000000
 ffff88007c54db38 0000000000000046 000000027d003700 ffff88007fd03fc0
 ffff88007c54dfd8 ffff88007c54dfd8 ffff88007c54dfd8 0000000000013e40
 ffff88007c9e9710 ffff88007c4bdc40 00000000000000b8 7fffffffffffffff
Call Trace:
 [<ffffffff816a1b99>] schedule+0x29/0x70
 [<ffffffff816a02d5>] schedule_timeout+0x1e5/0x250
 [<ffffffffa02f3987>] ? kmem_zone_alloc+0x67/0xe0 [xfs]
 [<ffffffff816798e6>] ? kmemleak_alloc+0x26/0x50
 [<ffffffff816a0f1b>] __down_common+0xa0/0xf0
 [<ffffffffa032f37c>] ? xfs_getsb+0x3c/0x70 [xfs]
 [<ffffffff816a0fde>] __down+0x1d/0x1f
 [<ffffffff81084591>] down+0x41/0x50
 [<ffffffffa02dcd44>] xfs_buf_lock+0x44/0x110 [xfs]
 [<ffffffffa032f37c>] xfs_getsb+0x3c/0x70 [xfs]
 [<ffffffffa033b4bc>] xfs_trans_getsb+0x4c/0x140 [xfs]
 [<ffffffffa032f06e>] xfs_mod_sb+0x4e/0xc0 [xfs]
 [<ffffffffa02e3b24>] xfs_fs_log_dummy+0x54/0x90 [xfs]
 [<ffffffffa0335bf8>] xfs_log_worker+0x48/0x50 [xfs]
 [<ffffffff81077a11>] process_one_work+0x141/0x4a0
 [<ffffffff810789e8>] worker_thread+0x168/0x410
 [<ffffffff81078880>] ? manage_workers+0x120/0x120
 [<ffffffff8107df10>] kthread+0xc0/0xd0
 [<ffffffff813a3ea4>] ? acpi_get_child+0x47/0x4d
 [<ffffffff813a3fb7>] ? acpi_platform_notify.part.0+0xbb/0xda
 [<ffffffff8107de50>] ? flush_kthread_worker+0xb0/0xb0
 [<ffffffff816ab86c>] ret_from_fork+0x7c/0xb0
 [<ffffffff8107de50>] ? flush_kthread_worker+0xb0/0xb0

[3] contains some of the information that the wiki asks for, captured right before doing the unmount in the original test. I have also re-run the original test with slub_debug=FZ, but it did not show any warnings (perhaps I am missing something about how to enable the red-zone debugging).

Thanks,
Alex.

[3]
root@vc-00-00-1075-dev:/mnt/work/alex# uname -a
Linux vc-00-00-1075-dev 3.8.13-557-generic #1382000791 SMP Thu Oct 17 11:22:20 IST 2013 x86_64 x86_64 x86_64 GNU/Linux

This is a manually-compiled 3.8.13 kernel, in which:
CONFIG_HAVE_DEBUG_KMEMLEAK=y
CONFIG_DEBUG_KMEMLEAK=y
but no code changes.
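In case it helps anyone retrying this: slub_debug=FZ on the kernel command line should enable SLUB sanity checks (F) and red zoning (Z), and whether it actually took effect can be checked from sysfs. A sketch (the kmalloc-256 cache is just an example; any slab cache under /sys/kernel/slab should show the flags):

```shell
#!/bin/sh
# Check whether SLUB red-zone debugging is active (assumes the SLUB
# allocator with CONFIG_SLUB_DEBUG=y and a mounted sysfs).
SLUB_OPT="slub_debug=FZ"   # F = sanity checks, Z = red zoning
echo "boot parameter to use: $SLUB_OPT"

grep -o 'slub_debug=[^ ]*' /proc/cmdline 2>/dev/null \
    || echo "slub_debug not present on the current command line"

# Per-cache debug flags appear as boolean sysfs files:
for f in red_zone poison sanity_checks; do
    p="/sys/kernel/slab/kmalloc-256/$f"
    [ -r "$p" ] && printf '%s: %s\n' "$f" "$(cat "$p")"
done
```

If red_zone reads 0 for the caches of interest, the corruption would go undetected even though the parameter was passed.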
root@vc-00-00-1075-dev:/mnt/work/alex# xfs_repair -V
xfs_repair version 3.1.7

number of CPUs: 4

root@vc-00-00-1075-dev:/mnt/work/alex# cat /proc/meminfo
MemTotal:        2050940 kB
MemFree:          996376 kB
Buffers:           32836 kB
Cached:           389208 kB
SwapCached:            0 kB
Active:           490232 kB
Inactive:         239988 kB
Active(anon):     311336 kB
Inactive(anon):     1336 kB
Active(file):     178896 kB
Inactive(file):   238652 kB
Unevictable:        4632 kB
Mlocked:            4632 kB
SwapTotal:        522236 kB
SwapFree:         522236 kB
Dirty:                96 kB
Writeback:             0 kB
AnonPages:        312920 kB
Mapped:            46656 kB
Shmem:              1456 kB
Slab:             224640 kB
SReclaimable:      30040 kB
SUnreclaim:       194600 kB
KernelStack:        3632 kB
PageTables:        14584 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     1547704 kB
Committed_AS:    1667900 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       30860 kB
VmallocChunk:   34359648832 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:      153588 kB
DirectMap2M:     1943552 kB

root@vc-00-00-1075-dev:/mnt/work/alex# cat /proc/mounts
rootfs / rootfs rw 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
udev /dev devtmpfs rw,relatime,size=1016840k,nr_inodes=254210,mode=755 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,relatime,size=410188k,mode=755 0 0
/dev/disk/by-uuid/c2a6ff28-cf03-4785-9ac8-60dd4ed7d3af / ext4 rw,relatime,errors=remount-ro,data=ordered 0 0
none /sys/fs/fuse/connections fusectl rw,relatime 0 0
none /sys/kernel/debug debugfs rw,relatime 0 0
none /sys/kernel/security securityfs rw,relatime 0 0
none /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=10256k 0 0
none /run/shm tmpfs rw,nosuid,nodev,relatime 0 0
rpc_pipefs /run/rpc_pipefs rpc_pipefs rw,relatime 0 0
172.16.0.12:/export2/work /mnt/work nfs4 rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=172.16.0.35,local_lock=none,addr=172.16.0.12 0 0
172.16.0.12:/export2/share /mnt/share nfs4 rw,relatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=172.16.0.35,local_lock=none,addr=172.16.0.12 0 0
nfsd /proc/fs/nfsd nfsd rw,relatime 0 0
cgroup_root /sys/fs/cgroup tmpfs rw,relatime 0 0
blkio /sys/fs/cgroup/blkio cgroup rw,relatime,perf_event,blkio,freezer,devices,cpuacct,cpu,cpuset 0 0
/dev/mapper/vpart-1 /mnt/bootdir ext4 rw,relatime,(null),debug,data=ordered 0 0
/dev/dm-20 /btrfs/pool-00000007 btrfs rw,noatime,nodatasum,nodatacow,notreelog,nospace_cache,clear_cache 0 0
/dev/dm-22 /export/v122_nas xfs rw,sync,noatime,wsync,attr2,inode64,noquota 0 0

root@vc-00-00-1075-dev:/mnt/work/alex# cat /proc/partitions
major minor  #blocks  name

 254        0   52428800 vda
 254       16   52428800 vdb
 254       32   20971520 vdc
 254       48   20971520 vdd
 254       64   20971520 vde
 254       80     512000 vdf
 254       96     512000 vdg
 254      112     512000 vdh
 254      128     512000 vdi
 254      144     512000 vdj
 254      160    1048576 vdk
 254      176    1048576 vdl
 254      192     312320 vdm
 254      208     312320 vdn
   8        0    5242880 sda
   8        1    4717568 sda1
   8        2          1 sda2
   8        5     522240 sda5
  11        0    1048575 sr0
 253        0 140737488355327 dm-0
 253        1     512000 dm-1
 253        2     512000 dm-2
 253        3     512000 dm-3
   9        1     511988 md1
 253        4     511988 dm-4
 253        5    1048576 dm-5
 253        7     312320 dm-7
 253        6   20971520 dm-6
 253        8   52428800 dm-8
 253        9   20971520 dm-9
 253       10    1048576 dm-10
 253       11     312320 dm-11
 253       12   52428800 dm-12
   9        5   18872320 md5
   9        2    1048564 md2
   9        3     312308 md3
   9        4   50329600 md4
 253       13   18872320 dm-13
 253       14     312308 dm-14
 253       15    1048564 dm-15
 253       16   50329600 dm-16
 253       17   49283072 dm-17
 253       18   49283072 dm-18
 253       19   49283072 dm-19
 253       20   49283072 dm-20
 253       21  127926272 dm-21
 253       22  127926272 dm-22

root@vc-00-00-1075-dev:/mnt/work/alex# xfs_info /export/v122_nas/
meta-data=/dev/dm-22             isize=256    agcount=4, agsize=7995392 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=31981568, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=15616, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

On Fri, Nov 22, 2013 at 12:07 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> [cc'd the correct xfs list. Please use xfs@xxxxxxxxxxx in future.]
>
> On Thu, Nov 21, 2013 at 08:04:36PM +0200, Alex Lyakas wrote:
>> Greetings,
>> I am using stock XFS from kernel 3.8.13, compiled with kmemleak
>> enabled. I am testing a particular scenario, in which the underlying
>> block device returns IO errors during XFS umount. Almost in all cases
>> this results in kernel crashes in various places, and sometimes
>> kmemleak complains, and sometimes CPU soft lockup happens. It always
>> happens after XFS messages like:
>
> What testing are you doing?
>
> http://xfs.org/index.php/XFS_FAQ#Q:_What_information_should_I_include_when_reporting_a_problem.3F
>
>> kernel: [  600.190509] XFS (dm-22): metadata I/O error: block
>> 0x7600030 ("xlog_iodone") error 125 numblks 64
>> kernel: [  600.192267] XFS (dm-22): xfs_do_force_shutdown(0x2) called
>> from line 1115 of file
>> /mnt/compile/linux-stable/source/fs/xfs/xfs_log.c. Return address =
>> 0xffffffffa05cffa1
>> kernel: [  600.192319] XFS (dm-22): Log I/O Error Detected. Shutting
>> down filesystem
>> kernel: [  600.192392] XFS (dm-22): Unable to update superblock
>> counters. Freespace may not be correct on next mount.
>> kernel: [  600.192398] XFS (dm-22): xfs_log_force: error 5 returned.
>> kernel: [  600.193687] XFS (声.Z): Please umount the filesystem and
>> rectify the problem(s)
>>
>> You can see here the garbage that XFS prints instead of the block device name.
>> In [1] and [2] I am attaching more kernel log from two such crashes.
>
> So, something is corrupting memory and stamping all over the XFS
> structures.
>
> What's error 125?
>
> #define ECANCELED       125     /* Operation Canceled */
>
> I can't find a driver that actually returns that error to
> filesystems, which....
>
>> kernel: [  600.227881] Modules linked in: xfs raid1 xfrm_user
>> xfrm4_tunnel tunnel4 ipcomp xfrm_ipcomp esp4 ah4 scst_vdisk(O)
>> iscsi_scst(O) scst(O) dm_zcache(O) dm_btrfs(O) btrfs(O) libcrc32c
>> dm_iostat(O)
>
> .... given you have a bunch of out-of-tree modules loaded (and some which
> are experimental) suggests that you have a problem with your
> storage...
>
> So, something is corrupting memory across a large number of
> subsystems, and the trigger is some custom code to run error
> injection. Can you reproduce the problem with something like
> dm-faulty?
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@xxxxxxxxxxxxx

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs
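[Editorial note on the error-injection suggestion above: assuming the in-tree dm-flakey target is what is meant, the custom target could be swapped for it roughly as follows. This is only a sketch with example values; the table format is "<start> <sectors> flakey <dev> <offset> <up_interval> <down_interval>", where the device works for up_interval seconds and then fails all IO for down_interval seconds.]

```shell
#!/bin/sh
# Sketch: reproducing the IO-error scenario with the standard dm-flakey
# target instead of a custom one. DEV/SECTORS/intervals are examples.
DEV=/dev/vde
SECTORS=41943040      # from: blockdev --getsz $DEV
UP=5                  # seconds of working IO
DOWN=30               # seconds during which every IO fails
FLAKEY_TABLE="0 $SECTORS flakey $DEV 0 $UP $DOWN"
echo "$FLAKEY_TABLE"

# dmsetup create VDE --table "$FLAKEY_TABLE"
# mount -o noatime,sync /dev/mapper/VDE /export/v122_nas/
# ... write some files, wait for a down interval, then:
# umount /export/v122_nas
```

Unlike the "error" target, flakey lets the filesystem mount successfully and then hit errors mid-flight, which is closer to the intermittent-failure scenario described at the top of this mail.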