nginx blocked task when XFS partition is very full

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,


This is on 2.6.32-32-server from Ubuntu Lucid 10.04 LTS

We have nginx fileserving from four disks on a machine.  We're using mogilefs as a distributed file system.  We've come across an issue where when the filesystem is very full, and nginx tries to write to it, we get "task blocked for more than 120 seconds" messages from the kernel, nginx hangs and we can no longer access the fs.  When a file is uploaded via nginx, nginx writes to a .tmp directory on the disk before moving the file to its destination.  nginx can't be kill -9ed, it needs a reboot.

Has this sort of issue been fixed since 2.6.32?


Thanks,

Sean


Here's the gory details:

[340381.743426] INFO: task flush-8:48:506 blocked for more than 120 seconds.
[340381.743802] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[340381.750570] flush-8:48    D 0000000000000000     0   506      2 0x00000000
[340381.750576]  ffff880126ea59b0 0000000000000046 0000000000015f00 0000000000015f00
[340381.750581]  ffff880126c5c8d0 ffff880126ea5fd8 0000000000015f00 ffff880126c5c500
[340381.750585]  0000000000015f00 ffff880126ea5fd8 0000000000015f00 ffff880126c5c8d0
[340381.750589] Call Trace:
[340381.750599]  [<ffffffff810f6410>] ? sync_page+0x0/0x50
[340381.750607]  [<ffffffff81558013>] io_schedule+0x73/0xc0
[340381.750610]  [<ffffffff810f644d>] sync_page+0x3d/0x50
[340381.750615]  [<ffffffff8155850a>] __wait_on_bit_lock+0x5a/0xc0
[340381.750618]  [<ffffffff810f63e7>] __lock_page+0x67/0x70
[340381.750624]  [<ffffffff810860d0>] ? wake_bit_function+0x0/0x40
[340381.750629]  [<ffffffff81100a15>] ? pagevec_lookup_tag+0x25/0x40
[340381.750632]  [<ffffffff810ffa87>] write_cache_pages+0x2f7/0x3e0
[340381.750637]  [<ffffffff810fe7d0>] ? __writepage+0x0/0x40
[340381.750641]  [<ffffffff810ffb94>] generic_writepages+0x24/0x30
[340381.750664]  [<ffffffffa0237afd>] xfs_vm_writepages+0x5d/0x80 [xfs]
[340381.750668]  [<ffffffff810ffbc1>] do_writepages+0x21/0x40
[340381.750673]  [<ffffffff81168db6>] writeback_single_inode+0xf6/0x3d0
[340381.750677]  [<ffffffff811694e5>] writeback_sb_inodes+0x195/0x280
[340381.750680]  [<ffffffff81169d00>] writeback_inodes_wb+0xa0/0x1b0
[340381.750684]  [<ffffffff8116a04b>] wb_writeback+0x23b/0x2a0
[340381.750688]  [<ffffffff81077bec>] ? lock_timer_base+0x3c/0x70
[340381.750692]  [<ffffffff8116a22c>] wb_do_writeback+0x17c/0x190
[340381.750696]  [<ffffffff81077d00>] ? process_timeout+0x0/0x10
[340381.750700]  [<ffffffff8116a293>] bdi_writeback_task+0x53/0xf0
[340381.750704]  [<ffffffff81111636>] bdi_start_fn+0x86/0x100
[340381.750708]  [<ffffffff811115b0>] ? bdi_start_fn+0x0/0x100
[340381.750711]  [<ffffffff81085d16>] kthread+0x96/0xa0
[340381.750716]  [<ffffffff810141ea>] child_rip+0xa/0x20
[340381.750719]  [<ffffffff81085c80>] ? kthread+0x0/0xa0
[340381.750723]  [<ffffffff810141e0>] ? child_rip+0x0/0x20

[340381.750732] INFO: task nginx:3581 blocked for more than 120 seconds.
[340381.755178] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[340381.765161] nginx         D 0000000000000000     0  3581   3580 0x00000000
[340381.765166]  ffff880128a27d38 0000000000000082 0000000000015f00 0000000000015f00
[340381.765170]  ffff88012742b1d0 ffff880128a27fd8 0000000000015f00 ffff88012742ae00
[340381.765174]  0000000000015f00 ffff880128a27fd8 0000000000015f00 ffff88012742b1d0
[340381.765178] Call Trace:
[340381.765184]  [<ffffffff81558d47>] __mutex_lock_slowpath+0x107/0x190
[340381.765189]  [<ffffffff811517c1>] ? path_put+0x31/0x40
[340381.765193]  [<ffffffff81558743>] mutex_lock+0x23/0x50
[340381.765197]  [<ffffffff81155139>] do_filp_open+0x3d9/0xba0
[340381.765203]  [<ffffffff810397f9>] ? default_spin_lock_flags+0x9/0x10
[340381.765209]  [<ffffffff81160aca>] ? alloc_fd+0x10a/0x150
[340381.765214]  [<ffffffff81144789>] do_sys_open+0x69/0x170
[340381.765217]  [<ffffffff811448d0>] sys_open+0x20/0x30
[340381.765223]  [<ffffffff810131b2>] system_call_fastpath+0x16/0x1b

... a couple of similar ones for nginx:3582, nginx:3583, nginx:3584, followed by ...

[340381.838307] INFO: task nginx:3585 blocked for more than 120 seconds.
[340381.849329] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[340381.872639] nginx         D 0000000000000000     0  3585   3580 0x00000000
[340381.872643]  ffff88012b3f9098 0000000000000086 0000000000015f00 0000000000015f00
[340381.872648]  ffff88012a3c03d0 ffff88012b3f9fd8 0000000000015f00 ffff88012a3c0000
[340381.872652]  0000000000015f00 ffff88012b3f9fd8 0000000000015f00 ffff88012a3c03d0
[340381.872656] Call Trace:
[340381.872660]  [<ffffffff8155837d>] schedule_timeout+0x22d/0x300
[340381.872676]  [<ffffffffa01fea12>] ? xfs_btree_is_lastrec+0x52/0x70 [xfs]
[340381.872691]  [<ffffffffa0203d04>] ? xfs_btree_insrec+0x2e4/0x5a0 [xfs]
[340381.872695]  [<ffffffff815592ae>] __down+0x7e/0xc0
[340381.872699]  [<ffffffff8108b021>] down+0x41/0x50
[340381.872714]  [<ffffffffa023b1f3>] xfs_buf_lock+0x23/0x60 [xfs]
[340381.872732]  [<ffffffffa023b375>] _xfs_buf_find+0x145/0x240 [xfs]
[340381.872749]  [<ffffffffa023b4d0>] xfs_buf_get_flags+0x60/0x170 [xfs]
[340381.872768]  [<ffffffffa02307eb>] xfs_trans_get_buf+0xeb/0x140 [xfs]
[340381.872786]  [<ffffffffa01ff74e>] xfs_btree_get_bufs+0x3e/0x40 [xfs]
[340381.872801]  [<ffffffffa01ec380>] xfs_alloc_fix_freelist+0x110/0x480 [xfs]
[340381.872821]  [<ffffffffa022ae5f>] ? xfs_icsb_disable_counter+0x1f/0x150 [xfs]
[340381.872829]  [<ffffffff812c1d90>] ? __bitmap_weight+0x50/0xb0
[340381.872847]  [<ffffffffa022afcb>] ? xfs_icsb_balance_counter_locked+0x3b/0xd0 [xfs]
[340381.872864]  [<ffffffffa01ec958>] xfs_alloc_vextent+0x178/0x490 [xfs]
[340381.872880]  [<ffffffffa01fa856>] xfs_bmap_btalloc+0x176/0x9f0 [xfs]
[340381.872897]  [<ffffffffa01fbfb1>] xfs_bmap_alloc+0x21/0x40 [xfs]
[340381.872914]  [<ffffffffa01fcb6f>] xfs_bmapi+0xb9f/0x1290 [xfs]
[340381.872933]  [<ffffffffa0217ba8>] ? xfs_iext_get_ext+0x38/0x80 [xfs]
[340381.872952]  [<ffffffffa021e055>] xfs_iomap_write_allocate+0x1c5/0x3c0 [xfs]
[340381.872972]  [<ffffffffa021edab>] xfs_iomap+0x2ab/0x2e0 [xfs]
[340381.872989]  [<ffffffffa023745d>] xfs_map_blocks+0x2d/0x40 [xfs]
[340381.873007]  [<ffffffffa023886a>] xfs_page_state_convert+0x3da/0x720 [xfs]
[340381.873025]  [<ffffffffa0238d0a>] xfs_vm_writepage+0x7a/0x130 [xfs]
[340381.873034]  [<ffffffff8110f91e>] ? __dec_zone_page_state+0x2e/0x30
[340381.873041]  [<ffffffff810fe7e7>] __writepage+0x17/0x40
[340381.873046]  [<ffffffff810ff967>] write_cache_pages+0x1d7/0x3e0
[340381.873053]  [<ffffffff810fe7d0>] ? __writepage+0x0/0x40
[340381.873059]  [<ffffffff810ffb94>] generic_writepages+0x24/0x30
[340381.873076]  [<ffffffffa0237afd>] xfs_vm_writepages+0x5d/0x80 [xfs]
[340381.873083]  [<ffffffff810ffbc1>] do_writepages+0x21/0x40
[340381.873089]  [<ffffffff810f6beb>] __filemap_fdatawrite_range+0x5b/0x60
[340381.873096]  [<ffffffff810f6f1f>] filemap_fdatawrite+0x1f/0x30
[340381.873113]  [<ffffffffa023bf79>] xfs_flush_pages+0xa9/0xc0 [xfs]
[340381.873130]  [<ffffffffa0240448>] xfs_write+0x838/0x9f0 [xfs]
[340381.873137]  [<ffffffff81077bec>] ? lock_timer_base+0x3c/0x70
[340381.873154]  [<ffffffffa023bde2>] xfs_file_aio_write+0x62/0x70 [xfs]
[340381.873162]  [<ffffffff8114679a>] do_sync_write+0xfa/0x140
[340381.873169]  [<ffffffff81086090>] ? autoremove_wake_function+0x0/0x40
[340381.873177]  [<ffffffff812552b6>] ? security_file_permission+0x16/0x20
[340381.873184]  [<ffffffff81146a98>] vfs_write+0xb8/0x1a0
[340381.873190]  [<ffffffff81147402>] sys_pwrite64+0x82/0xa0
[340381.873197]  [<ffffffff810131b2>] system_call_fastpath+0x16/0x1b

[340381.912512] INFO: task nginx:3587 blocked for more than 120 seconds.
[340381.926794] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[340381.956631] nginx         D 0000000000000000     0  3587   3580 0x00000000
[340381.956635]  ffff88012a28da38 0000000000000086 0000000000015f00 0000000000015f00
[340381.956639]  ffff88012a3c31d0 ffff88012a28dfd8 0000000000015f00 ffff88012a3c2e00
[340381.956643]  0000000000015f00 ffff88012a28dfd8 0000000000015f00 ffff88012a3c31d0
[340381.956647] Call Trace:
[340381.956652]  [<ffffffff8155837d>] schedule_timeout+0x22d/0x300
[340381.956656]  [<ffffffff8105f022>] ? enqueue_entity+0x132/0x1b0
[340381.956660]  [<ffffffff8105f0fb>] ? enqueue_task_fair+0x5b/0xa0
[340381.956664]  [<ffffffff815574f6>] wait_for_common+0xd6/0x180
[340381.956668]  [<ffffffff8105df00>] ? default_wake_function+0x0/0x20
[340381.956671]  [<ffffffff8105df55>] ? wake_up_process+0x15/0x20
[340381.956675]  [<ffffffff8155765d>] wait_for_completion+0x1d/0x20
[340381.956690]  [<ffffffffa0243a27>] xfs_flush_inodes+0x67/0x90 [xfs]
[340381.956706]  [<ffffffffa0233348>] xfs_create+0x328/0x5f0 [xfs]
[340381.956725]  [<ffffffffa022f749>] ? xfs_trans_unlocked_item+0x39/0x60 [xfs]
[340381.956742]  [<ffffffffa023eb97>] xfs_vn_mknod+0xa7/0x1c0 [xfs]
[340381.956759]  [<ffffffffa023ece0>] xfs_vn_create+0x10/0x20 [xfs]
[340381.956766]  [<ffffffff81151f34>] vfs_create+0xb4/0xe0
[340381.956772]  [<ffffffff81152024>] __open_namei_create+0xc4/0x110
[340381.956779]  [<ffffffff811557cb>] do_filp_open+0xa6b/0xba0
[340381.956786]  [<ffffffff810397f9>] ? default_spin_lock_flags+0x9/0x10
[340381.956793]  [<ffffffff81160aca>] ? alloc_fd+0x10a/0x150
[340381.956799]  [<ffffffff81144789>] do_sys_open+0x69/0x170
[340381.956805]  [<ffffffff811448d0>] sys_open+0x20/0x30
[340381.956811]  [<ffffffff810131b2>] system_call_fastpath+0x16/0x1b
_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs


[Index of Archives]     [Linux XFS Devel]     [Linux Filesystem Development]     [Filesystem Testing]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux