xfs warnings on kernel 3.2.15

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



I've noticed one of our postgresql servers having a high load, so I looked into it.
Kernel = 3.2.15, disk space is free:

# df
/dev/sda2              6281216    457888   5823328   8% /disks/work
/dev/sdc1             52402180  10135176  42267004  20% /disks/pg-db

And I got these warnings since Jul 12 until today:

Jul 12 00:31:41 db24 kernel: XFS (sda2): xlog_space_left: head behind tail
Jul 12 00:31:41 db24 kernel:  tail_cycle = 4, tail_bytes = 248320
Jul 12 00:31:41 db24 kernel:  GH   cycle = 4, GH   bytes = 248312
[snip]
Aug  8 03:45:48 db24 kernel: XFS (sdc1): xlog_space_left: head behind tail
Aug  8 03:45:48 db24 kernel:  tail_cycle = 607, tail_bytes = 21702144
Aug  8 03:45:48 db24 kernel:  GH   cycle = 607, GH   bytes = 21701856

And this morning it changed into warnings:

Aug  9 10:34:57 db24 kernel: INFO: task vmtoolsd:1426 blocked for more than 120 seconds.
Aug  9 10:34:57 db24 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug  9 10:34:57 db24 kernel: ffff880059e0fc78 0000000000000082 ffff880059af3ba0 0000000000000001
Aug  9 10:34:57 db24 kernel: ffff880059e0e010 ffff880058ea5cc0 0000000000010940 ffff880059e0ffd8
Aug  9 10:34:57 db24 kernel: ffff880059e0ffd8 0000000000010940 ffff880056264f80 ffff880058ea5cc0
Aug  9 10:34:57 db24 kernel: Call Trace:
Aug  9 10:34:57 db24 kernel: [<ffffffff8136e926>] schedule+0x55/0x57
Aug  9 10:34:57 db24 kernel: [<ffffffff8136fb1e>] rwsem_down_failed_common+0xe4/0x116
Aug  9 10:34:57 db24 kernel: [<ffffffff8136fb63>] rwsem_down_write_failed+0x13/0x15
Aug  9 10:34:57 db24 kernel: [<ffffffff811c4503>] call_rwsem_down_write_failed+0x13/0x20
Aug  9 10:34:57 db24 kernel: [<ffffffff8136f35c>] ? down_write+0x20/0x22
Aug  9 10:34:57 db24 kernel: [<ffffffff810d1e44>] thaw_super+0x1c/0xae
Aug  9 10:34:57 db24 kernel: [<ffffffff810f6b5e>] thaw_bdev+0x45/0x74
Aug  9 10:34:57 db24 kernel: [<ffffffffa002019c>] VmSyncThawDevices+0x56/0x9e [vmsync]
Aug  9 10:34:57 db24 kernel: [<ffffffffa0020542>] VmSyncUnlockedIoctl+0x1b4/0x208 [vmsync]
Aug  9 10:34:57 db24 kernel: [<ffffffff81115dd1>] proc_reg_unlocked_ioctl+0x82/0xa1
Aug  9 10:34:57 db24 kernel: [<ffffffff8102c05d>] ? pick_next_task_fair+0xfc/0x10c
Aug  9 10:34:57 db24 kernel: [<ffffffffa002038e>] ? VmSyncAddPath+0xff/0xff [vmsync]
Aug  9 10:34:57 db24 kernel: [<ffffffff810dcf9f>] do_vfs_ioctl+0x3f6/0x47b
Aug  9 10:34:57 db24 kernel: [<ffffffff810dd066>] sys_ioctl+0x42/0x65
Aug  9 10:34:57 db24 kernel: [<ffffffff813705d2>] system_call_fastpath+0x16/0x1b
Aug  9 10:34:57 db24 kernel: INFO: task postmaster:3112 blocked for more than 120 seconds.
Aug  9 10:34:57 db24 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug  9 10:34:57 db24 kernel: ffff8800563f5d18 0000000000000082 0000000000000000 0000000000000000
Aug  9 10:34:57 db24 kernel: ffff8800563f4010 ffff8800590d93e0 0000000000010940 ffff8800563f5fd8
Aug  9 10:34:57 db24 kernel: ffff8800563f5fd8 0000000000010940 ffff88005ab9c240 ffff8800590d93e0
Aug  9 10:34:57 db24 kernel: Call Trace:
Aug  9 10:34:57 db24 kernel: [<ffffffff8136e926>] schedule+0x55/0x57
Aug  9 10:34:57 db24 kernel: [<ffffffff8114ed8b>] xfs_file_aio_write+0x118/0x224
Aug  9 10:34:57 db24 kernel: [<ffffffff8104c59a>] ? wake_up_bit+0x25/0x25
Aug  9 10:34:57 db24 kernel: [<ffffffff810cf526>] do_sync_write+0xc9/0x106
Aug  9 10:34:57 db24 kernel: [<ffffffff8100875a>] ? read_tsc+0x9/0x1b
Aug  9 10:34:57 db24 kernel: [<ffffffff810528b0>] ? timekeeping_get_ns+0x15/0x34
Aug  9 10:34:57 db24 kernel: [<ffffffff81053370>] ? ktime_get_ts+0x80/0x89
Aug  9 10:34:57 db24 kernel: [<ffffffff810dda04>] ? poll_select_copy_remaining+0xc6/0xea
Aug  9 10:34:57 db24 kernel: [<ffffffff810cfe57>] vfs_write+0xa9/0x105
Aug  9 10:34:57 db24 kernel: [<ffffffff810cff69>] sys_write+0x45/0x69
Aug  9 10:34:57 db24 kernel: [<ffffffff813705d2>] system_call_fastpath+0x16/0x1b
Aug  9 10:34:57 db24 kernel: INFO: task postmaster:3115 blocked for more than 120 seconds.
Aug  9 10:34:57 db24 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug  9 10:34:57 db24 kernel: ffff8800563edb38 0000000000000082 0000000000000000 ffff880000000000
Aug  9 10:34:57 db24 kernel: ffff8800563ec010 ffff8800590dcf80 0000000000010940 ffff8800563edfd8
Aug  9 10:34:57 db24 kernel: ffff8800563edfd8 0000000000010940 ffff88005ab9c240 ffff8800590dcf80
Aug  9 10:34:57 db24 kernel: Call Trace:
Aug  9 10:34:57 db24 kernel: [<ffffffff8136e926>] schedule+0x55/0x57
Aug  9 10:34:57 db24 kernel: [<ffffffff8118a64a>] xfs_trans_alloc+0x79/0xa7
Aug  9 10:34:57 db24 kernel: [<ffffffff8104c59a>] ? wake_up_bit+0x25/0x25
Aug  9 10:34:57 db24 kernel: [<ffffffff8115b99a>] xfs_create+0x18c/0x4c7
Aug  9 10:34:57 db24 kernel: [<ffffffff81150b56>] ? xfs_ilock+0x59/0x5e
Aug  9 10:34:57 db24 kernel: [<ffffffff810df065>] ? spin_lock+0x9/0xb
Aug  9 10:34:57 db24 kernel: [<ffffffff810df877>] ? _d_rehash+0x1f/0x22
Aug  9 10:34:57 db24 kernel: [<ffffffff811541d1>] xfs_vn_mknod+0xca/0x15e
Aug  9 10:34:57 db24 kernel: [<ffffffff81154280>] xfs_vn_create+0xb/0xd
Aug  9 10:34:57 db24 kernel: [<ffffffff810da7ad>] vfs_create+0x4e/0x6e
Aug  9 10:34:57 db24 kernel: [<ffffffff810db68c>] do_last+0x201/0x502
Aug  9 10:34:57 db24 kernel: [<ffffffff810dba61>] path_openat+0xd4/0x31a
Aug  9 10:34:57 db24 kernel: [<ffffffff810c13f4>] ? kmem_cache_free+0x15/0x6e
Aug  9 10:34:57 db24 kernel: [<ffffffff810dbd96>] do_filp_open+0x38/0x86
Aug  9 10:34:57 db24 kernel: [<ffffffff810e507a>] ? alloc_fd+0x116/0x129
Aug  9 10:34:57 db24 kernel: [<ffffffff810ce147>] do_sys_open+0x6e/0x107
Aug  9 10:34:57 db24 kernel: [<ffffffff810ce209>] sys_open+0x1b/0x1d
Aug  9 10:34:57 db24 kernel: [<ffffffff813705d2>] system_call_fastpath+0x16/0x1b
Aug  9 10:34:57 db24 kernel: INFO: task flush-259:26214:19297 blocked for more than 120 seconds.
Aug  9 10:34:57 db24 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug  9 10:34:57 db24 kernel: ffff880052039b10 0000000000000046 ffff880052039aa0 ffff880047536078
Aug  9 10:34:57 db24 kernel: ffff880052038010 ffff880056263500 0000000000010940 ffff880052039fd8
Aug  9 10:34:57 db24 kernel: ffff880052039fd8 0000000000010940 ffff880059620000 ffff880056263500
Aug  9 10:34:57 db24 kernel: Call Trace:
Aug  9 10:34:57 db24 kernel: [<ffffffff8104c605>] ? spin_unlock_irqrestore+0x9/0xb
Aug  9 10:34:57 db24 kernel: [<ffffffff8136e926>] schedule+0x55/0x57
Aug  9 10:34:57 db24 kernel: [<ffffffff8118a64a>] xfs_trans_alloc+0x79/0xa7
Aug  9 10:34:57 db24 kernel: [<ffffffff8104c59a>] ? wake_up_bit+0x25/0x25
Aug  9 10:34:57 db24 kernel: [<ffffffff8109180a>] ? generic_writepages+0x4b/0x57
Aug  9 10:34:57 db24 kernel: [<ffffffff8115906b>] xfs_log_dirty_inode+0x2b/0x9f
Aug  9 10:34:57 db24 kernel: [<ffffffff8115716d>] xfs_fs_write_inode+0x47/0x107
Aug  9 10:34:57 db24 kernel: [<ffffffff810ec01b>] writeback_single_inode+0x1b8/0x2ef
Aug  9 10:34:57 db24 kernel: [<ffffffff810ec705>] writeback_sb_inodes+0x168/0x201
Aug  9 10:34:57 db24 kernel: [<ffffffff810eccfe>] __writeback_inodes_wb+0x6d/0xab
Aug  9 10:34:57 db24 kernel: [<ffffffff810ecea6>] wb_writeback+0xf1/0x18b
Aug  9 10:34:57 db24 kernel: [<ffffffff8101f406>] ? default_spin_lock_flags+0x9/0xf
Aug  9 10:34:57 db24 kernel: [<ffffffff8136fbc4>] ? _raw_spin_lock_irqsave+0x22/0x2b
Aug  9 10:34:57 db24 kernel: [<ffffffff810ed07f>] wb_do_writeback+0x13f/0x1a0
Aug  9 10:34:57 db24 kernel: [<ffffffff8103fdbe>] ? del_timer+0x81/0x81
Aug  9 10:34:57 db24 kernel: [<ffffffff810ed170>] bdi_writeback_thread+0x90/0x164
Aug  9 10:34:57 db24 kernel: [<ffffffff810ed0e0>] ? wb_do_writeback+0x1a0/0x1a0
Aug  9 10:34:57 db24 kernel: [<ffffffff8104c17c>] kthread+0x7d/0x85
Aug  9 10:34:57 db24 kernel: [<ffffffff81371eb4>] kernel_thread_helper+0x4/0x10
Aug  9 10:34:57 db24 kernel: [<ffffffff8104c0ff>] ? kthread_worker_fn+0x153/0x153
Aug  9 10:34:57 db24 kernel: [<ffffffff81371eb0>] ? gs_change+0x13/0x13
Aug  9 10:34:57 db24 kernel: INFO: task postmaster:26077 blocked for more than 120 seconds.
Aug  9 10:34:57 db24 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug  9 10:34:57 db24 kernel: ffff8800257e7d18 0000000000000086 ffffea00014d6680 0000000000000000
Aug  9 10:34:57 db24 kernel: ffff8800257e6010 ffff880056261a80 0000000000010940 ffff8800257e7fd8
Aug  9 10:34:57 db24 kernel: ffff8800257e7fd8 0000000000010940 ffffffff8160d020 ffff880056261a80
Aug  9 10:34:57 db24 kernel: Call Trace:
Aug  9 10:34:57 db24 kernel: [<ffffffff8136e926>] schedule+0x55/0x57
Aug  9 10:34:57 db24 kernel: [<ffffffff8114ed8b>] xfs_file_aio_write+0x118/0x224
Aug  9 10:34:57 db24 kernel: [<ffffffff8104c59a>] ? wake_up_bit+0x25/0x25
Aug  9 10:34:57 db24 kernel: [<ffffffff810cf526>] do_sync_write+0xc9/0x106
Aug  9 10:34:57 db24 kernel: [<ffffffff810cfe57>] vfs_write+0xa9/0x105
Aug  9 10:34:57 db24 kernel: [<ffffffff810cff69>] sys_write+0x45/0x69
Aug  9 10:34:57 db24 kernel: [<ffffffff813705d2>] system_call_fastpath+0x16/0x1b

Aug  9 10:36:57 db24 kernel: INFO: task vmtoolsd:1426 blocked for more than 120 seconds.
Aug  9 10:36:57 db24 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug  9 10:36:57 db24 kernel: ffff880059e0fc78 0000000000000082 ffff880059af3ba0 0000000000000001
Aug  9 10:36:57 db24 kernel: ffff880059e0e010 ffff880058ea5cc0 0000000000010940 ffff880059e0ffd8
Aug  9 10:36:57 db24 kernel: ffff880059e0ffd8 0000000000010940 ffff880056264f80 ffff880058ea5cc0
Aug  9 10:36:57 db24 kernel: Call Trace:
Aug  9 10:36:57 db24 kernel: [<ffffffff8136e926>] schedule+0x55/0x57
Aug  9 10:36:57 db24 kernel: [<ffffffff8136fb1e>] rwsem_down_failed_common+0xe4/0x116
Aug  9 10:36:57 db24 kernel: [<ffffffff8136fb63>] rwsem_down_write_failed+0x13/0x15
Aug  9 10:36:57 db24 kernel: [<ffffffff811c4503>] call_rwsem_down_write_failed+0x13/0x20
Aug  9 10:36:57 db24 kernel: [<ffffffff8136f35c>] ? down_write+0x20/0x22
Aug  9 10:36:57 db24 kernel: [<ffffffff810d1e44>] thaw_super+0x1c/0xae
Aug  9 10:36:57 db24 kernel: [<ffffffff810f6b5e>] thaw_bdev+0x45/0x74
Aug  9 10:36:57 db24 kernel: [<ffffffffa002019c>] VmSyncThawDevices+0x56/0x9e [vmsync]
Aug  9 10:36:57 db24 kernel: [<ffffffffa0020542>] VmSyncUnlockedIoctl+0x1b4/0x208 [vmsync]
Aug  9 10:36:57 db24 kernel: [<ffffffff81115dd1>] proc_reg_unlocked_ioctl+0x82/0xa1
Aug  9 10:36:57 db24 kernel: [<ffffffff8102c05d>] ? pick_next_task_fair+0xfc/0x10c
Aug  9 10:36:57 db24 kernel: [<ffffffffa002038e>] ? VmSyncAddPath+0xff/0xff [vmsync]
Aug  9 10:36:57 db24 kernel: [<ffffffff810dcf9f>] do_vfs_ioctl+0x3f6/0x47b
Aug  9 10:36:57 db24 kernel: [<ffffffff810dd066>] sys_ioctl+0x42/0x65
Aug  9 10:36:57 db24 kernel: [<ffffffff813705d2>] system_call_fastpath+0x16/0x1b
Aug  9 10:36:57 db24 kernel: INFO: task postmaster:3112 blocked for more than 120 seconds.
Aug  9 10:36:57 db24 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug  9 10:36:57 db24 kernel: ffff8800563f5d18 0000000000000082 0000000000000000 0000000000000000
Aug  9 10:36:57 db24 kernel: ffff8800563f4010 ffff8800590d93e0 0000000000010940 ffff8800563f5fd8
Aug  9 10:36:57 db24 kernel: ffff8800563f5fd8 0000000000010940 ffff88005ab9c240 ffff8800590d93e0
Aug  9 10:36:57 db24 kernel: Call Trace:
Aug  9 10:36:57 db24 kernel: [<ffffffff8136e926>] schedule+0x55/0x57
Aug  9 10:36:57 db24 kernel: [<ffffffff8114ed8b>] xfs_file_aio_write+0x118/0x224
Aug  9 10:36:57 db24 kernel: [<ffffffff8104c59a>] ? wake_up_bit+0x25/0x25
Aug  9 10:36:57 db24 kernel: [<ffffffff810cf526>] do_sync_write+0xc9/0x106
Aug  9 10:36:57 db24 kernel: [<ffffffff8100875a>] ? read_tsc+0x9/0x1b
Aug  9 10:36:57 db24 kernel: [<ffffffff810528b0>] ? timekeeping_get_ns+0x15/0x34
Aug  9 10:36:57 db24 kernel: [<ffffffff81053370>] ? ktime_get_ts+0x80/0x89
Aug  9 10:36:57 db24 kernel: [<ffffffff810dda04>] ? poll_select_copy_remaining+0xc6/0xea
Aug  9 10:36:57 db24 kernel: [<ffffffff810cfe57>] vfs_write+0xa9/0x105
Aug  9 10:36:57 db24 kernel: [<ffffffff810cff69>] sys_write+0x45/0x69
Aug  9 10:36:57 db24 kernel: [<ffffffff813705d2>] system_call_fastpath+0x16/0x1b
Aug  9 10:36:57 db24 kernel: INFO: task postmaster:3115 blocked for more than 120 seconds.
Aug  9 10:36:57 db24 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug  9 10:36:57 db24 kernel: ffff8800563edb38 0000000000000082 0000000000000000 ffff880000000000
Aug  9 10:36:57 db24 kernel: ffff8800563ec010 ffff8800590dcf80 0000000000010940 ffff8800563edfd8
Aug  9 10:36:57 db24 kernel: ffff8800563edfd8 0000000000010940 ffff88005ab9c240 ffff8800590dcf80
Aug  9 10:36:57 db24 kernel: Call Trace:
Aug  9 10:36:57 db24 kernel: [<ffffffff8136e926>] schedule+0x55/0x57
Aug  9 10:36:57 db24 kernel: [<ffffffff8118a64a>] xfs_trans_alloc+0x79/0xa7
Aug  9 10:36:57 db24 kernel: [<ffffffff8104c59a>] ? wake_up_bit+0x25/0x25
Aug  9 10:36:57 db24 kernel: [<ffffffff8115b99a>] xfs_create+0x18c/0x4c7
Aug  9 10:36:57 db24 kernel: [<ffffffff81150b56>] ? xfs_ilock+0x59/0x5e
Aug  9 10:36:57 db24 kernel: [<ffffffff810df065>] ? spin_lock+0x9/0xb
Aug  9 10:36:57 db24 kernel: [<ffffffff810df877>] ? _d_rehash+0x1f/0x22
Aug  9 10:36:57 db24 kernel: [<ffffffff811541d1>] xfs_vn_mknod+0xca/0x15e
Aug  9 10:36:57 db24 kernel: [<ffffffff81154280>] xfs_vn_create+0xb/0xd
Aug  9 10:36:57 db24 kernel: [<ffffffff810da7ad>] vfs_create+0x4e/0x6e
Aug  9 10:36:57 db24 kernel: [<ffffffff810db68c>] do_last+0x201/0x502
Aug  9 10:36:57 db24 kernel: [<ffffffff810dba61>] path_openat+0xd4/0x31a
Aug  9 10:36:57 db24 kernel: [<ffffffff810c13f4>] ? kmem_cache_free+0x15/0x6e
Aug  9 10:36:57 db24 kernel: [<ffffffff810dbd96>] do_filp_open+0x38/0x86
Aug  9 10:36:57 db24 kernel: [<ffffffff810e507a>] ? alloc_fd+0x116/0x129
Aug  9 10:36:57 db24 kernel: [<ffffffff810ce147>] do_sys_open+0x6e/0x107
Aug  9 10:36:57 db24 kernel: [<ffffffff810ce209>] sys_open+0x1b/0x1d
Aug  9 10:36:57 db24 kernel: [<ffffffff813705d2>] system_call_fastpath+0x16/0x1b
Aug  9 10:36:57 db24 kernel: INFO: task postmaster:13762 blocked for more than 120 seconds.
Aug  9 10:36:57 db24 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug  9 10:36:57 db24 kernel: ffff880029025d18 0000000000000082 ffff880000000000 0000000000000000
Aug  9 10:36:57 db24 kernel: ffff880029024010 ffff880056260d40 0000000000010940 ffff880029025fd8
Aug  9 10:36:57 db24 kernel: ffff880029025fd8 0000000000010940 ffffffff8160d020 ffff880056260d40
Aug  9 10:36:57 db24 kernel: Call Trace:
Aug  9 10:36:57 db24 kernel: [<ffffffff8136e926>] schedule+0x55/0x57
Aug  9 10:36:57 db24 kernel: [<ffffffff8114ed8b>] xfs_file_aio_write+0x118/0x224
Aug  9 10:36:57 db24 kernel: [<ffffffff8104c59a>] ? wake_up_bit+0x25/0x25
Aug  9 10:36:57 db24 kernel: [<ffffffff810cf526>] do_sync_write+0xc9/0x106
Aug  9 10:36:57 db24 kernel: [<ffffffff810cfe57>] vfs_write+0xa9/0x105
Aug  9 10:36:57 db24 kernel: [<ffffffff810cff69>] sys_write+0x45/0x69
Aug  9 10:36:57 db24 kernel: [<ffffffff813705d2>] system_call_fastpath+0x16/0x1b

I then tried
# echo 3 >/proc/sys/vm/drop_caches
but the system hung completely.

This is a VMware VM, and I have no idea how to send SYS-T or similar keys to it.
I had to reset it (reboot hung), and everything worked again. This VM is now on kernel 3.5

--
mit freundlichen Grüssen,
Michael Monnerie, Ing. BSc

it-management Internet Services: Protéger
http://proteger.at [gesprochen: Prot-e-schee]
Tel: +43 660 / 415 6531

Attachment: signature.asc
Description: This is a digitally signed message part.

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs

[Index of Archives]     [Linux XFS Devel]     [Linux Filesystem Development]     [Filesystem Testing]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux