I've tried various guests, including the most recent Fedora 12 kernels
and custom 2.6.32.x kernels.
All of them hang at around the same point (~1GB written) when I do heavy
write IO inside the guest.
I have waited 30 minutes to see if the guest would recover, but it just
sits there, not writing back any data, not doing anything - but
certainly not allowing any new IO writes. The host has some load on it,
but nothing heavy enough to completely hang a guest for that long.
The steps I run inside the guest:
mount -o loop some_image.fs ./somewhere
dd if=/dev/zero of=./somewhere/zero bs=512
then after ~1GB: sync
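For anyone who wants to script these steps, here is a minimal sketch. It assumes that sync is issued while dd is still writing, which is what the backtraces suggest (dd and sync are both blocked at the same time). The variable names IMAGE, MNT and THRESHOLD and the helpers reached and reproduce are made up for illustration, and the one-second polling loop is just one way to notice the ~1GB mark.

```shell
#!/bin/sh
# Sketch of the reproduction steps; run reproduce as root inside the guest.
# IMAGE, MNT and THRESHOLD are illustrative names, not part of the report.
IMAGE=${IMAGE:-some_image.fs}
MNT=${MNT:-./somewhere}
THRESHOLD=${THRESHOLD:-$((1024 * 1024 * 1024))}   # ~1GB, where the hang shows up

# Succeeds once the regular file $1 holds at least $2 bytes.
reached() {
    [ -f "$1" ] || return 1
    size=$(wc -c < "$1") || return 1
    [ "$size" -ge "$2" ]
}

# Mount the image via loop, write zeros in the background, then sync
# once roughly THRESHOLD bytes have been written.
reproduce() {
    mount -o loop "$IMAGE" "$MNT" || return 1
    dd if=/dev/zero of="$MNT/zero" bs=512 &
    dd_pid=$!
    until reached "$MNT/zero" "$THRESHOLD"; do
        # Stop polling if dd has already exited (for example, disk full).
        kill -0 "$dd_pid" 2>/dev/null || break
        sleep 1
    done
    sync    # on the affected guests this does not return
}
```

Nothing runs until reproduce is called, so the file can be sourced first and the variables overridden from the environment.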
Host is running: 2.6.31.4
QEMU PC emulator version 0.10.50 (qemu-kvm-devel-88)
Guests are booted with "elevator=noop" as the filesystems are stored as
files, accessed as virtio disks.
The "hung" backtraces always look similar to these:
[ 361.460136] INFO: task loop0:2097 blocked for more than 120 seconds.
[ 361.460139] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 361.460142] loop0 D ffff88000b92c848 0 2097 2 0x00000080
[ 361.460148] ffff88000b92c5d0 0000000000000046 ffff880008c1f810 ffff880009829fd8
[ 361.460153] ffff880009829fd8 ffff880009829fd8 ffff88000a21ee80 ffff88000b92c5d0
[ 361.460157] ffff880009829610 ffffffff8181b768 ffff880001af33b0 0000000000000002
[ 361.460161] Call Trace:
[ 361.460216] [<ffffffff8105bf12>] ? sync_page+0x0/0x43
[ 361.460253] [<ffffffff8151383e>] ? io_schedule+0x2c/0x43
[ 361.460257] [<ffffffff8105bf50>] ? sync_page+0x3e/0x43
[ 361.460261] [<ffffffff81513a2a>] ? __wait_on_bit+0x41/0x71
[ 361.460264] [<ffffffff8105c092>] ? wait_on_page_bit+0x6a/0x70
[ 361.460283] [<ffffffff810385a7>] ? wake_bit_function+0x0/0x23
[ 361.460287] [<ffffffff81064975>] ? shrink_page_list+0x3e5/0x61e
[ 361.460291] [<ffffffff81513992>] ? schedule_timeout+0xa3/0xbe
[ 361.460305] [<ffffffff81038579>] ? autoremove_wake_function+0x0/0x2e
[ 361.460308] [<ffffffff8106538f>] ? shrink_zone+0x7e1/0xaf6
[ 361.460310] [<ffffffff81061725>] ? determine_dirtyable_memory+0xd/0x17
[ 361.460314] [<ffffffff810637da>] ? isolate_pages_global+0xa3/0x216
[ 361.460316] [<ffffffff81062712>] ? mark_page_accessed+0x2a/0x39
[ 361.460335] [<ffffffff810a61db>] ? __find_get_block+0x13b/0x15c
[ 361.460337] [<ffffffff81065ed4>] ? try_to_free_pages+0x1ab/0x2c9
[ 361.460340] [<ffffffff81063737>] ? isolate_pages_global+0x0/0x216
[ 361.460343] [<ffffffff81060baf>] ? __alloc_pages_nodemask+0x394/0x564
[ 361.460350] [<ffffffff8108250c>] ? __slab_alloc+0x137/0x44f
[ 361.460371] [<ffffffff812cc4c1>] ? radix_tree_preload+0x1f/0x6a
[ 361.460374] [<ffffffff81082a08>] ? kmem_cache_alloc+0x5d/0x88
[ 361.460376] [<ffffffff812cc4c1>] ? radix_tree_preload+0x1f/0x6a
[ 361.460379] [<ffffffff8105c0b5>] ? add_to_page_cache_locked+0x1d/0xf1
[ 361.460381] [<ffffffff8105c1b0>] ? add_to_page_cache_lru+0x27/0x57
[ 361.460384] [<ffffffff8105c25a>] ? grab_cache_page_write_begin+0x7a/0xa0
[ 361.460399] [<ffffffff81104620>] ? ext3_write_begin+0x7e/0x201
[ 361.460417] [<ffffffff8134648f>] ? do_lo_send_aops+0xa1/0x174
[ 361.460420] [<ffffffff81081948>] ? virt_to_head_page+0x9/0x2a
[ 361.460422] [<ffffffff8134686b>] ? loop_thread+0x309/0x48a
[ 361.460425] [<ffffffff813463ee>] ? do_lo_send_aops+0x0/0x174
[ 361.460427] [<ffffffff81038579>] ? autoremove_wake_function+0x0/0x2e
[ 361.460430] [<ffffffff81346562>] ? loop_thread+0x0/0x48a
[ 361.460432] [<ffffffff8103819b>] ? kthread+0x78/0x80
[ 361.460441] [<ffffffff810238df>] ? finish_task_switch+0x2b/0x78
[ 361.460454] [<ffffffff81002f6a>] ? child_rip+0xa/0x20
[ 361.460460] [<ffffffff81012ac3>] ? native_pax_close_kernel+0x0/0x32
[ 361.460463] [<ffffffff81038123>] ? kthread+0x0/0x80
[ 361.460469] [<ffffffff81002f60>] ? child_rip+0x0/0x20
[ 361.460471] INFO: task kjournald:2098 blocked for more than 120 seconds.
[ 361.460473] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 361.460474] kjournald D ffff88000b92e558 0 2098 2 0x00000080
[ 361.460477] ffff88000b92e2e0 0000000000000046 ffff88000aad9840 ffff88000983ffd8
[ 361.460480] ffff88000983ffd8 ffff88000983ffd8 ffffffff81808e00 ffff88000b92e2e0
[ 361.460483] ffff88000983fcf0 ffffffff8181b768 ffff880001af3c40 0000000000000002
[ 361.460486] Call Trace:
[ 361.460488] [<ffffffff810a6b16>] ? sync_buffer+0x0/0x3c
[ 361.460491] [<ffffffff8151383e>] ? io_schedule+0x2c/0x43
[ 361.460494] [<ffffffff810a6b4e>] ? sync_buffer+0x38/0x3c
[ 361.460496] [<ffffffff81513a2a>] ? __wait_on_bit+0x41/0x71
[ 361.460499] [<ffffffff810a6b16>] ? sync_buffer+0x0/0x3c
[ 361.460501] [<ffffffff81513ac4>] ? out_of_line_wait_on_bit+0x6a/0x76
[ 361.460504] [<ffffffff810385a7>] ? wake_bit_function+0x0/0x23
[ 361.460514] [<ffffffff8113edad>] ? journal_commit_transaction+0x769/0xbb8
[ 361.460517] [<ffffffff810238df>] ? finish_task_switch+0x2b/0x78
[ 361.460519] [<ffffffff815137d9>] ? thread_return+0x40/0x79
[ 361.460522] [<ffffffff8114162d>] ? kjournald+0xc7/0x1cb
[ 361.460525] [<ffffffff81038579>] ? autoremove_wake_function+0x0/0x2e
[ 361.460527] [<ffffffff81141566>] ? kjournald+0x0/0x1cb
[ 361.460530] [<ffffffff8103819b>] ? kthread+0x78/0x80
[ 361.460532] [<ffffffff810238df>] ? finish_task_switch+0x2b/0x78
[ 361.460534] [<ffffffff81002f6a>] ? child_rip+0xa/0x20
[ 361.460537] [<ffffffff81012ac3>] ? native_pax_close_kernel+0x0/0x32
[ 361.460540] [<ffffffff81038123>] ? kthread+0x0/0x80
[ 361.460542] [<ffffffff81002f60>] ? child_rip+0x0/0x20
[ 361.460544] INFO: task dd:2132 blocked for more than 120 seconds.
[ 361.460546] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 361.460547] dd D ffff88000a21f0f8 0 2132 2090 0x00000080
[ 361.460550] ffff88000a21ee80 0000000000000082 ffff88000a21ee80 ffff88000b3affd8
[ 361.460553] ffff88000b3affd8 ffff88000b3affd8 ffffffff81808e00 ffff880001af3510
[ 361.460556] ffff88000b78eaf0 ffff88000b3daa00 ffff880008de6c40 ffff88000ab44a80
[ 361.460558] Call Trace:
[ 361.460561] [<ffffffff8113dda5>] ? do_get_write_access+0x1f5/0x3b6
[ 361.460564] [<ffffffff81061956>] ? get_dirty_limits+0x1dc/0x210
[ 361.460566] [<ffffffff810385a7>] ? wake_bit_function+0x0/0x23
[ 361.460569] [<ffffffff810a6218>] ? __getblk+0x1c/0x26c
[ 361.460576] [<ffffffff8155d1d0>] ? __func__.28446+0x0/0x20
[ 361.460578] [<ffffffff8113df88>] ? journal_get_write_access+0x22/0x34
[ 361.460582] [<ffffffff8110dd9b>] ? __ext3_journal_get_write_access+0x1e/0x47
[ 361.460584] [<ffffffff81101c4d>] ? ext3_reserve_inode_write+0x3e/0x75
[ 361.460587] [<ffffffff81101c9a>] ? ext3_mark_inode_dirty+0x16/0x31
[ 361.460589] [<ffffffff81101deb>] ? ext3_dirty_inode+0x62/0x7a
[ 361.460592] [<ffffffff810a10d9>] ? __mark_inode_dirty+0x25/0x134
[ 361.460595] [<ffffffff81098b80>] ? file_update_time+0xd4/0xfb
[ 361.460598] [<ffffffff8105ced8>] ? __generic_file_aio_write+0x16c/0x290
[ 361.460600] [<ffffffff8105d055>] ? generic_file_aio_write+0x59/0x9f
[ 361.460603] [<ffffffff81087ab5>] ? do_sync_write+0xcd/0x112
[ 361.460606] [<ffffffff810132d4>] ? pvclock_clocksource_read+0x3a/0x70
[ 361.460609] [<ffffffff81038579>] ? autoremove_wake_function+0x0/0x2e
[ 361.460612] [<ffffffff81000d1a>] ? __switch_to+0x177/0x255
[ 361.460621] [<ffffffff8127891e>] ? selinux_file_permission+0x4d/0xa3
[ 361.460624] [<ffffffff810883d8>] ? vfs_write+0xfc/0x138
[ 361.460627] [<ffffffff810884d0>] ? sys_write+0x45/0x6e
[ 361.460629] [<ffffffff810020ff>] ? system_call_fastpath+0x16/0x1b
[ 361.460632] INFO: task sync:2164 blocked for more than 120 seconds.
[ 361.460633] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 361.460639] sync D ffff88000ba11f88 0 2164 2136 0x00000080
[ 361.460642] ffff88000ba11d10 0000000000000086 0000000100000246 ffff88000b1e9fd8
[ 361.460645] ffff88000b1e9fd8 ffff88000b1e9fd8 ffffffff81808e00 ffff88000b3daa00
[ 361.460648] 00000000000001cc ffff88000b1e9e68 ffff88000b1e9e80 ffff88000b3daa78
[ 361.460651] Call Trace:
[ 361.460653] [<ffffffff8114122b>] ? log_wait_commit+0x9e/0xe0
[ 361.460656] [<ffffffff81038579>] ? autoremove_wake_function+0x0/0x2e
[ 361.460659] [<ffffffff81108fe7>] ? ext3_sync_fs+0x42/0x4b
[ 361.460669] [<ffffffff810c9711>] ? sync_quota_sb+0x45/0xf6
[ 361.460672] [<ffffffff810a4cd2>] ? __sync_filesystem+0x43/0x70
[ 361.460675] [<ffffffff810a4d86>] ? sync_filesystems+0x87/0xbd
[ 361.460677] [<ffffffff810a4e01>] ? sys_sync+0x1c/0x2e
[ 361.460679] [<ffffffff810020ff>] ? system_call_fastpath+0x16/0x1b
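To get a quick overview of which tasks the hung-task detector flagged in a log like the one above, something along these lines may help. It is only a sketch: the function name blocked_tasks is made up, and the pattern is matched against the "INFO: task NAME:PID blocked for more than N seconds." lines exactly as they appear in this report, so other kernels may word the message differently.

```shell
#!/bin/sh
# Print "name:pid" for every task reported by the hung-task detector.
# Reads dmesg-style text on stdin; blocked_tasks is an illustrative name.
blocked_tasks() {
    sed -n 's/.*INFO: task \(.*\):\([0-9][0-9]*\) blocked for more than [0-9][0-9]* seconds\.*$/\1:\2/p'
}
```

Inside the guest it could be used as `dmesg | blocked_tasks`. For the log above it would list loop0, kjournald, dd and sync with their PIDs.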