Hi Brian,

Thanks for your explanation!

Kernel version: 3.10.0-229.el7.x86_64

"Fixes for (hopefully) most of these problems are pending in 4.3"
Has 4.3-rc1 already fixed this problem?

-----Original Message-----
From: Brian Foster [mailto:bfoster@xxxxxxxxxx]
Sent: September 15, 2015 19:14
To: zhaomingyue 09440 (RD)
Cc: xfs@xxxxxxxxxxx; xudonghai 11507 (RD)
Subject: Re: umount hanging problem

On Tue, Sep 15, 2015 at 02:59:53AM +0000, zhao.mingyue@xxxxxxx wrote:
> Hi, everyone
>
> I have a question about the XFS filesystem; can someone help me solve
> my problem?
>

What kernel?

> I built an Oracle database on an rbd block device on which an xfs
> filesystem had been created, then I removed the rbd block device, after
> which the Oracle file directory was broken and couldn't be used. Next,
> I recovered the rbd block device and unmounted the Oracle file
> directory. What happened next is what confuses me: the umount process
> hangs and can't be killed. How can I solve this problem?
>

It's waiting to flush everything out that was previously written to the
log and sitting in the AIL list. It sounds like the filesystem shut down
because the underlying block device was torn down before the filesystem
was unmounted, which is not the way to do things. ;)

Recent kernels have had issues with extent freeing logging such that
EFI/EFD objects are not reference counted correctly and can sit in the
AIL and pin it indefinitely, particularly on races with fs shutdown.
Fixes for (hopefully) most of these problems are pending in 4.3. For the
time being you'll probably have to reboot to recover. Also note that
this should never happen except due to some other crash, caused in this
case by breaking down the block device before unmounting the fs.

Brian

> thanks!
>
> part of the log is as follows:
>
> Sep 11 10:22:29 localhost systemd: Started Fingerprint Authentication Daemon.
> Sep 11 10:22:29 localhost fprintd: Launching FprintObject
> Sep 11 10:22:29 localhost fprintd: ** Message: D-Bus service launched with name: net.reactivated.Fprint
> Sep 11 10:22:29 localhost fprintd: ** Message: entering main loop
> Sep 11 10:22:39 localhost kernel: XFS (dm-2): xfs_log_force: error 5 returned.
> Sep 11 10:22:47 localhost systemd-logind: Removed session 1.
> Sep 11 10:22:59 localhost fprintd: ** Message: No devices in use, exit
> Sep 11 10:23:09 localhost kernel: XFS (dm-2): xfs_log_force: error 5 returned.
> Sep 11 10:23:19 localhost systemd-logind: Removed session 136.
> Sep 11 10:23:39 localhost kernel: XFS (dm-2): xfs_log_force: error 5 returned.
> Sep 11 10:23:44 localhost kernel: INFO: task umount:27950 blocked for more than 120 seconds.
> Sep 11 10:23:44 localhost kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 11 10:23:44 localhost kernel: umount D ffff880e1faf3680 0 27950 27775 0x00000084
> Sep 11 10:23:44 localhost kernel: ffff880dd3247d90 0000000000000086 ffff8806f730db00 ffff880dd3247fd8
> Sep 11 10:23:44 localhost kernel: ffff880dd3247fd8 ffff880dd3247fd8 ffff8806f730db00 ffff8806e7ceae80
> Sep 11 10:23:44 localhost kernel: ffff8806cb888b48 ffff8806e7ceaec0 ffff8806e7ceaee8 ffff8806e7ceae90
> Sep 11 10:23:44 localhost kernel: Call Trace:
> Sep 11 10:23:44 localhost kernel: [<ffffffff81609259>] schedule+0x29/0x70
> Sep 11 10:23:44 localhost kernel: [<ffffffffa02f1ac1>] xfs_ail_push_all_sync+0xc1/0x110 [xfs]
> Sep 11 10:23:44 localhost kernel: [<ffffffff81098230>] ? wake_up_bit+0x30/0x30
> Sep 11 10:23:44 localhost kernel: [<ffffffffa02a1c48>] xfs_unmountfs+0x68/0x160 [xfs]
> Sep 11 10:23:44 localhost kernel: [<ffffffffa02a24eb>] ? xfs_mru_cache_destroy+0x6b/0x90 [xfs]
> Sep 11 10:23:44 localhost kernel: [<ffffffffa02a3701>] xfs_fs_put_super+0x21/0x60 [xfs]
> Sep 11 10:23:44 localhost kernel: [<ffffffff811c8936>] generic_shutdown_super+0x56/0xe0
> Sep 11 10:23:44 localhost kernel: [<ffffffff811c8c17>] kill_block_super+0x27/0x70
> Sep 11 10:23:44 localhost kernel: [<ffffffff811c8f4d>] deactivate_locked_super+0x3d/0x60
> Sep 11 10:23:44 localhost kernel: [<ffffffff811c9556>] deactivate_super+0x46/0x60
> Sep 11 10:23:44 localhost kernel: [<ffffffff811e6265>] mntput_no_expire+0xc5/0x120
> Sep 11 10:23:44 localhost kernel: [<ffffffff811e739f>] SyS_umount+0x9f/0x3c0
> Sep 11 10:23:44 localhost kernel: [<ffffffff81613da9>] system_call_fastpath+0x16/0x1b
> Sep 11 10:23:57 localhost systemd-logind: Removed session 133.
> Sep 11 10:24:09 localhost kernel: XFS (dm-2): xfs_log_force: error 5 returned.
> Sep 11 10:24:39 localhost kernel: XFS (dm-2): xfs_log_force: error 5 returned.
> Sep 11 10:25:09 localhost kernel: XFS (dm-2): xfs_log_force: error 5 returned.
> Sep 11 10:25:39 localhost kernel: XFS (dm-2): xfs_log_force: error 5 returned.
> Sep 11 10:25:44 localhost kernel: INFO: task umount:27950 blocked for more than 120 seconds.
> Sep 11 10:25:44 localhost kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 11 10:25:44 localhost kernel: umount D ffff880e1faf3680 0 27950 27775 0x00000084
> Sep 11 10:25:44 localhost kernel: ffff880dd3247d90 0000000000000086 ffff8806f730db00 ffff880dd3247fd8
> Sep 11 10:25:44 localhost kernel: ffff880dd3247fd8 ffff880dd3247fd8 ffff8806f730db00 ffff8806e7ceae80
> Sep 11 10:25:44 localhost kernel: ffff8806cb888b48 ffff8806e7ceaec0 ffff8806e7ceaee8 ffff8806e7ceae90
> Sep 11 10:25:44 localhost kernel: Call Trace:
> Sep 11 10:25:44 localhost kernel: [<ffffffff81609259>] schedule+0x29/0x70
> Sep 11 10:25:44 localhost kernel: [<ffffffffa02f1ac1>] xfs_ail_push_all_sync+0xc1/0x110 [xfs]
> Sep 11 10:25:44 localhost kernel: [<ffffffff81098230>] ? wake_up_bit+0x30/0x30
> Sep 11 10:25:44 localhost kernel: [<ffffffffa02a1c48>] xfs_unmountfs+0x68/0x160 [xfs]
> Sep 11 10:25:44 localhost kernel: [<ffffffffa02a24eb>] ? xfs_mru_cache_destroy+0x6b/0x90 [xfs]
> Sep 11 10:25:44 localhost kernel: [<ffffffffa02a3701>] xfs_fs_put_super+0x21/0x60 [xfs]
> Sep 11 10:25:44 localhost kernel: [<ffffffff811c8936>] generic_shutdown_super+0x56/0xe0
> Sep 11 10:25:44 localhost kernel: [<ffffffff811c8c17>] kill_block_super+0x27/0x70
> Sep 11 10:25:44 localhost kernel: [<ffffffff811c8f4d>] deactivate_locked_super+0x3d/0x60
> Sep 11 10:25:44 localhost kernel: [<ffffffff811c9556>] deactivate_super+0x46/0x60
> Sep 11 10:25:44 localhost kernel: [<ffffffff811e6265>] mntput_no_expire+0xc5/0x120
> Sep 11 10:25:44 localhost kernel: [<ffffffff811e739f>] SyS_umount+0x9f/0x3c0
> Sep 11 10:25:44 localhost kernel: [<ffffffff81613da9>] system_call_fastpath+0x16/0x1b
> Sep 11 10:26:05 localhost systemd-logind: New session 138 of user root.
> Sep 11 10:26:05 localhost systemd: Starting Session 138 of user root.
> Sep 11 10:26:05 localhost systemd: Started Session 138 of user root.
> Sep 11 10:26:09 localhost kernel: XFS (dm-2): xfs_log_force: error 5 returned.
> Sep 11 10:26:29 localhost systemd-logind: New session 139 of user root.
> Sep 11 10:26:29 localhost systemd: Starting Session 139 of user root.
> Sep 11 10:26:29 localhost systemd: Started Session 139 of user root.
> Sep 11 10:26:32 localhost systemd-logind: Removed session 139.
> Sep 11 10:26:39 localhost kernel: XFS (dm-2): xfs_log_force: error 5 returned.
> Sep 11 10:27:09 localhost kernel: XFS (dm-2): xfs_log_force: error 5 returned.
> Sep 11 10:27:38 localhost goa[28775]: goa-daemon version 3.8.5 starting [main.c:113, main()]
> Sep 11 10:27:38 localhost goa[28782]: GoaKerberosIdentityManager: Using polling for change notification for credential cache type 'KEYRING' [goakerberosidentitymanager.c:1394, monitor_credentials_cache()]
> Sep 11 10:27:40 localhost kernel: XFS (dm-2): xfs_log_force: error 5 returned.
> Sep 11 10:27:44 localhost kernel: INFO: task umount:27950 blocked for more than 120 seconds.
> Sep 11 10:27:44 localhost kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 11 10:27:44 localhost kernel: umount D ffff880e1faf3680 0 27950 27775 0x00000084
> Sep 11 10:27:44 localhost kernel: ffff880dd3247d90 0000000000000086 ffff8806f730db00 ffff880dd3247fd8
> Sep 11 10:27:44 localhost kernel: ffff880dd3247fd8 ffff880dd3247fd8 ffff8806f730db00 ffff8806e7ceae80
> Sep 11 10:27:44 localhost kernel: ffff8806cb888b48 ffff8806e7ceaec0 ffff8806e7ceaee8 ffff8806e7ceae90
> Sep 11 10:27:44 localhost kernel: Call Trace:
> Sep 11 10:27:44 localhost kernel: [<ffffffff81609259>] schedule+0x29/0x70
> Sep 11 10:27:44 localhost kernel: [<ffffffffa02f1ac1>] xfs_ail_push_all_sync+0xc1/0x110 [xfs]
> Sep 11 10:27:44 localhost kernel: [<ffffffff81098230>] ? wake_up_bit+0x30/0x30
> Sep 11 10:27:44 localhost kernel: [<ffffffffa02a1c48>] xfs_unmountfs+0x68/0x160 [xfs]
> Sep 11 10:27:44 localhost kernel: [<ffffffffa02a24eb>] ? xfs_mru_cache_destroy+0x6b/0x90 [xfs]
> Sep 11 10:27:44 localhost kernel: [<ffffffffa02a3701>] xfs_fs_put_super+0x21/0x60 [xfs]
> Sep 11 10:27:44 localhost kernel: [<ffffffff811c8936>] generic_shutdown_super+0x56/0xe0
> Sep 11 10:27:44 localhost kernel: [<ffffffff811c8c17>] kill_block_super+0x27/0x70
> Sep 11 10:27:44 localhost kernel: [<ffffffff811c8f4d>] deactivate_locked_super+0x3d/0x60
> Sep 11 10:27:44 localhost kernel: [<ffffffff811c9556>] deactivate_super+0x46/0x60
> Sep 11 10:27:44 localhost kernel: [<ffffffff811e6265>] mntput_no_expire+0xc5/0x120
> Sep 11 10:27:44 localhost kernel: [<ffffffff811e739f>] SyS_umount+0x9f/0x3c0
> Sep 11 10:27:44 localhost kernel: [<ffffffff81613da9>] system_call_fastpath+0x16/0x1b
> Sep 11 10:28:10 localhost kernel: XFS (dm-2): xfs_log_force: error 5 returned.
> Sep 11 10:28:40 localhost kernel: XFS (dm-2): xfs_log_force: error 5 returned.
> Sep 11 10:29:10 localhost kernel: XFS (dm-2): xfs_log_force: error 5 returned.
> Sep 11 10:29:40 localhost kernel: XFS (dm-2): xfs_log_force: error 5 returned.
> Sep 11 10:29:44 localhost kernel: INFO: task umount:27950 blocked for more than 120 seconds.
> Sep 11 10:29:44 localhost kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 11 10:29:44 localhost kernel: umount D ffff880e1faf3680 0 27950 27775 0x00000084
> Sep 11 10:29:44 localhost kernel: ffff880dd3247d90 0000000000000086 ffff8806f730db00 ffff880dd3247fd8
> Sep 11 10:29:44 localhost kernel: ffff880dd3247fd8 ffff880dd3247fd8 ffff8806f730db00 ffff8806e7ceae80
> Sep 11 10:29:44 localhost kernel: ffff8806cb888b48 ffff8806e7ceaec0 ffff8806e7ceaee8 ffff8806e7ceae90
> Sep 11 10:29:44 localhost kernel: Call Trace:
> Sep 11 10:29:44 localhost kernel: [<ffffffff81609259>] schedule+0x29/0x70
> Sep 11 10:29:44 localhost kernel: [<ffffffffa02f1ac1>] xfs_ail_push_all_sync+0xc1/0x110 [xfs]
> Sep 11 10:29:44 localhost kernel: [<ffffffff81098230>] ? wake_up_bit+0x30/0x30
> Sep 11 10:29:44 localhost kernel: [<ffffffffa02a1c48>] xfs_unmountfs+0x68/0x160 [xfs]
> Sep 11 10:29:44 localhost kernel: [<ffffffffa02a24eb>] ? xfs_mru_cache_destroy+0x6b/0x90 [xfs]
> Sep 11 10:29:44 localhost kernel: [<ffffffffa02a3701>] xfs_fs_put_super+0x21/0x60 [xfs]
> Sep 11 10:29:44 localhost kernel: [<ffffffff811c8936>] generic_shutdown_super+0x56/0xe0
> Sep 11 10:29:44 localhost kernel: [<ffffffff811c8c17>] kill_block_super+0x27/0x70
> Sep 11 10:29:44 localhost kernel: [<ffffffff811c8f4d>] deactivate_locked_super+0x3d/0x60
> Sep 11 10:29:44 localhost kernel: [<ffffffff811c9556>] deactivate_super+0x46/0x60
> Sep 11 10:29:44 localhost kernel: [<ffffffff811e6265>] mntput_no_expire+0xc5/0x120
> Sep 11 10:29:44 localhost kernel: [<ffffffff811e739f>] SyS_umount+0x9f/0x3c0
> Sep 11 10:29:44 localhost kernel: [<ffffffff81613da9>] system_call_fastpath+0x16/0x1b
> Sep 11 10:29:44 localhost kernel: INFO: task mount:28669 blocked for more than 120 seconds.
> Sep 11 10:29:44 localhost kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 11 10:29:44 localhost kernel: mount D ffff880703cf3680 0 28669 27990 0x00000084
> Sep 11 10:29:44 localhost kernel: ffff8806ca59fc48 0000000000000082 ffff880dea8771c0 ffff8806ca59ffd8
> Sep 11 10:29:44 localhost kernel: ffff8806ca59ffd8 ffff8806ca59ffd8 ffff880dea8771c0 ffff880dea8771c0
> Sep 11 10:29:44 localhost kernel: ffff880dec81f068 ffff880dec81f070 ffffffff00000000 ffff880dec81f078
> Sep 11 10:29:44 localhost kernel: Call Trace:
> Sep 11 10:29:44 localhost kernel: [<ffffffff81609259>] schedule+0x29/0x70
> Sep 11 10:29:44 localhost kernel: [<ffffffff8160ab35>] rwsem_down_write_failed+0x115/0x220
> Sep 11 10:29:44 localhost kernel: [<ffffffff81201481>] ? __blkdev_get+0x221/0x4d0
> Sep 11 10:29:44 localhost kernel: [<ffffffff811c8830>] ? set_bdev_super+0x40/0x40
> Sep 11 10:29:44 localhost kernel: [<ffffffff812e29a3>] call_rwsem_down_write_failed+0x13/0x20
> Sep 11 10:29:44 localhost kernel: [<ffffffff8160864d>] ? down_write+0x2d/0x30
> Sep 11 10:29:44 localhost kernel: [<ffffffff811c917e>] grab_super+0x2e/0xa0
> Sep 11 10:29:44 localhost kernel: [<ffffffff811c9810>] sget+0x2a0/0x3d0
> Sep 11 10:29:44 localhost kernel: [<ffffffff811c87f0>] ? ns_test_super+0x20/0x20
> Sep 11 10:29:44 localhost kernel: [<ffffffff811c9b92>] mount_bdev+0xe2/0x1f0
> Sep 11 10:29:44 localhost kernel: [<ffffffffa02a4580>] ? xfs_parseargs+0xbf0/0xbf0 [xfs]
> Sep 11 10:29:44 localhost kernel: [<ffffffff812d6230>] ? ida_get_new_above+0x230/0x2a0
> Sep 11 10:29:44 localhost kernel: [<ffffffffa02a2775>] xfs_fs_mount+0x15/0x20 [xfs]
> Sep 11 10:29:44 localhost kernel: [<ffffffff811ca599>] mount_fs+0x39/0x1b0
> Sep 11 10:29:44 localhost kernel: [<ffffffff81178420>] ? __alloc_percpu+0x10/0x20
> Sep 11 10:29:44 localhost kernel: [<ffffffff811e5a6f>] vfs_kern_mount+0x5f/0xf0
> Sep 11 10:29:44 localhost kernel: [<ffffffff811e7fbe>] do_mount+0x24e/0xa40
> Sep 11 10:29:44 localhost kernel: [<ffffffff8117317b>] ? strndup_user+0x4b/0xf0
> Sep 11 10:29:44 localhost kernel: [<ffffffff811e8846>] SyS_mount+0x96/0xf0
> Sep 11 10:29:44 localhost kernel: [<ffffffff81613da9>] system_call_fastpath+0x16/0x1b
> Sep 11 10:30:01 localhost systemd: Starting Session 140 of user root.
> Sep 11 10:30:01 localhost systemd: Started Session 140 of user root.
> Sep 11 10:30:10 localhost kernel: XFS (dm-2): xfs_log_force: error 5 returned.
> Sep 11 10:30:40 localhost kernel: XFS (dm-2): xfs_log_force: error 5 returned.
> Sep 11 10:31:10 localhost kernel: XFS (dm-2): xfs_log_force: error 5 returned.
> Sep 11 10:31:40 localhost kernel: XFS (dm-2): xfs_log_force: error 5 returned.
> Sep 11 10:31:44 localhost kernel: INFO: task umount:27950 blocked for more than 120 seconds.
> Sep 11 10:31:44 localhost kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 11 10:31:44 localhost kernel: umount D ffff880e1faf3680 0 27950 27775 0x00000084
> Sep 11 10:31:44 localhost kernel: ffff880dd3247d90 0000000000000086 ffff8806f730db00 ffff880dd3247fd8
> Sep 11 10:31:44 localhost kernel: ffff880dd3247fd8 ffff880dd3247fd8 ffff8806f730db00 ffff8806e7ceae80
> Sep 11 10:31:44 localhost kernel: ffff8806cb888b48 ffff8806e7ceaec0 ffff8806e7ceaee8 ffff8806e7ceae90
> Sep 11 10:31:44 localhost kernel: Call Trace:
> Sep 11 10:31:44 localhost kernel: [<ffffffff81609259>] schedule+0x29/0x70
> Sep 11 10:31:44 localhost kernel: [<ffffffffa02f1ac1>] xfs_ail_push_all_sync+0xc1/0x110 [xfs]
> Sep 11 10:31:44 localhost kernel: [<ffffffff81098230>] ? wake_up_bit+0x30/0x30
> Sep 11 10:31:44 localhost kernel: [<ffffffffa02a1c48>] xfs_unmountfs+0x68/0x160 [xfs]
> Sep 11 10:31:44 localhost kernel: [<ffffffffa02a24eb>] ? xfs_mru_cache_destroy+0x6b/0x90 [xfs]
> Sep 11 10:31:44 localhost kernel: [<ffffffffa02a3701>] xfs_fs_put_super+0x21/0x60 [xfs]
> Sep 11 10:31:44 localhost kernel: [<ffffffff811c8936>] generic_shutdown_super+0x56/0xe0
> Sep 11 10:31:44 localhost kernel: [<ffffffff811c8c17>] kill_block_super+0x27/0x70
> Sep 11 10:31:44 localhost kernel: [<ffffffff811c8f4d>] deactivate_locked_super+0x3d/0x60
> Sep 11 10:31:44 localhost kernel: [<ffffffff811c9556>] deactivate_super+0x46/0x60
> Sep 11 10:31:44 localhost kernel: [<ffffffff811e6265>] mntput_no_expire+0xc5/0x120
> Sep 11 10:31:44 localhost kernel: [<ffffffff811e739f>] SyS_umount+0x9f/0x3c0
> Sep 11 10:31:44 localhost kernel: [<ffffffff81613da9>] system_call_fastpath+0x16/0x1b
> Sep 11 10:32:10 localhost kernel: XFS (dm-2): xfs_log_force: error 5 returned.
> ----------------------------------------------------------------------
> This e-mail and its attachments contain confidential information from
> H3C, which is intended only for the person or entity whose address is
> listed above. Any use of the information contained herein in any way
> (including, but not limited to, total or partial disclosure,
> reproduction, or dissemination) by persons other than the intended
> recipient(s) is prohibited. If you receive this e-mail in error,
> please notify the sender by phone or email immediately and delete it!
> _______________________________________________
> xfs mailing list
> xfs@xxxxxxxxxxx
> http://oss.sgi.com/mailman/listinfo/xfs