Some months ago I also thought elevator=noop should be a good idea. But it isn't. It works good as long as you only do short IO requests. Try using deadline in host and guest. Robert On 01/21/10 18:26, Antoine Martin wrote: > I've tried various guests, including most recent Fedora12 kernels, > custom 2.6.32.x > All of them hang around the same point (~1GB written) when I do heavy IO > write inside the guest. > I have waited 30 minutes to see if the guest would recover, but it just > sits there, not writing back any data, not doing anything - but > certainly not allowing any new IO writes. The host has some load on it, > but nothing heavy enough to completely hand a guest for that long. > > mount -o loop some_image.fs ./somewhere bs=512 > dd if=/dev/zero of=/somewhere/zero > then after ~1GB: sync > > Host is running: 2.6.31.4 > QEMU PC emulator version 0.10.50 (qemu-kvm-devel-88) > > Guests are booted with "elevator=noop" as the filesystems are stored as > files, accessed as virtio disks. > > > The "hung" backtraces always look similar to these: > [ 361.460136] INFO: task loop0:2097 blocked for more than 120 seconds. > [ 361.460139] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [ 361.460142] loop0 D ffff88000b92c848 0 2097 2 > 0x00000080 > [ 361.460148] ffff88000b92c5d0 0000000000000046 ffff880008c1f810 > ffff880009829fd8 > [ 361.460153] ffff880009829fd8 ffff880009829fd8 ffff88000a21ee80 > ffff88000b92c5d0 > [ 361.460157] ffff880009829610 ffffffff8181b768 ffff880001af33b0 > 0000000000000002 > [ 361.460161] Call Trace: > [ 361.460216] [<ffffffff8105bf12>] ? sync_page+0x0/0x43 > [ 361.460253] [<ffffffff8151383e>] ? io_schedule+0x2c/0x43 > [ 361.460257] [<ffffffff8105bf50>] ? sync_page+0x3e/0x43 > [ 361.460261] [<ffffffff81513a2a>] ? __wait_on_bit+0x41/0x71 > [ 361.460264] [<ffffffff8105c092>] ? wait_on_page_bit+0x6a/0x70 > [ 361.460283] [<ffffffff810385a7>] ? wake_bit_function+0x0/0x23 > [ 361.460287] [<ffffffff81064975>] ? shrink_page_list+0x3e5/0x61e > [ 361.460291] [<ffffffff81513992>] ? schedule_timeout+0xa3/0xbe > [ 361.460305] [<ffffffff81038579>] ? autoremove_wake_function+0x0/0x2e > [ 361.460308] [<ffffffff8106538f>] ? shrink_zone+0x7e1/0xaf6 > [ 361.460310] [<ffffffff81061725>] ? determine_dirtyable_memory+0xd/0x17 > [ 361.460314] [<ffffffff810637da>] ? isolate_pages_global+0xa3/0x216 > [ 361.460316] [<ffffffff81062712>] ? mark_page_accessed+0x2a/0x39 > [ 361.460335] [<ffffffff810a61db>] ? __find_get_block+0x13b/0x15c > [ 361.460337] [<ffffffff81065ed4>] ? try_to_free_pages+0x1ab/0x2c9 > [ 361.460340] [<ffffffff81063737>] ? isolate_pages_global+0x0/0x216 > [ 361.460343] [<ffffffff81060baf>] ? __alloc_pages_nodemask+0x394/0x564 > [ 361.460350] [<ffffffff8108250c>] ? __slab_alloc+0x137/0x44f > [ 361.460371] [<ffffffff812cc4c1>] ? radix_tree_preload+0x1f/0x6a > [ 361.460374] [<ffffffff81082a08>] ? kmem_cache_alloc+0x5d/0x88 > [ 361.460376] [<ffffffff812cc4c1>] ? radix_tree_preload+0x1f/0x6a > [ 361.460379] [<ffffffff8105c0b5>] ? add_to_page_cache_locked+0x1d/0xf1 > [ 361.460381] [<ffffffff8105c1b0>] ? add_to_page_cache_lru+0x27/0x57 > [ 361.460384] [<ffffffff8105c25a>] ? > grab_cache_page_write_begin+0x7a/0xa0 > [ 361.460399] [<ffffffff81104620>] ? ext3_write_begin+0x7e/0x201 > [ 361.460417] [<ffffffff8134648f>] ? do_lo_send_aops+0xa1/0x174 > [ 361.460420] [<ffffffff81081948>] ? virt_to_head_page+0x9/0x2a > [ 361.460422] [<ffffffff8134686b>] ? loop_thread+0x309/0x48a > [ 361.460425] [<ffffffff813463ee>] ? do_lo_send_aops+0x0/0x174 > [ 361.460427] [<ffffffff81038579>] ? autoremove_wake_function+0x0/0x2e > [ 361.460430] [<ffffffff81346562>] ? loop_thread+0x0/0x48a > [ 361.460432] [<ffffffff8103819b>] ? kthread+0x78/0x80 > [ 361.460441] [<ffffffff810238df>] ? finish_task_switch+0x2b/0x78 > [ 361.460454] [<ffffffff81002f6a>] ? child_rip+0xa/0x20 > [ 361.460460] [<ffffffff81012ac3>] ? native_pax_close_kernel+0x0/0x32 > [ 361.460463] [<ffffffff81038123>] ? kthread+0x0/0x80 > [ 361.460469] [<ffffffff81002f60>] ? child_rip+0x0/0x20 > [ 361.460471] INFO: task kjournald:2098 blocked for more than 120 seconds. > [ 361.460473] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [ 361.460474] kjournald D ffff88000b92e558 0 2098 2 > 0x00000080 > [ 361.460477] ffff88000b92e2e0 0000000000000046 ffff88000aad9840 > ffff88000983ffd8 > [ 361.460480] ffff88000983ffd8 ffff88000983ffd8 ffffffff81808e00 > ffff88000b92e2e0 > [ 361.460483] ffff88000983fcf0 ffffffff8181b768 ffff880001af3c40 > 0000000000000002 > [ 361.460486] Call Trace: > [ 361.460488] [<ffffffff810a6b16>] ? sync_buffer+0x0/0x3c > [ 361.460491] [<ffffffff8151383e>] ? io_schedule+0x2c/0x43 > [ 361.460494] [<ffffffff810a6b4e>] ? sync_buffer+0x38/0x3c > [ 361.460496] [<ffffffff81513a2a>] ? __wait_on_bit+0x41/0x71 > [ 361.460499] [<ffffffff810a6b16>] ? sync_buffer+0x0/0x3c > [ 361.460501] [<ffffffff81513ac4>] ? out_of_line_wait_on_bit+0x6a/0x76 > [ 361.460504] [<ffffffff810385a7>] ? wake_bit_function+0x0/0x23 > [ 361.460514] [<ffffffff8113edad>] ? > journal_commit_transaction+0x769/0xbb8 > [ 361.460517] [<ffffffff810238df>] ? finish_task_switch+0x2b/0x78 > [ 361.460519] [<ffffffff815137d9>] ? thread_return+0x40/0x79 > [ 361.460522] [<ffffffff8114162d>] ? kjournald+0xc7/0x1cb > [ 361.460525] [<ffffffff81038579>] ? autoremove_wake_function+0x0/0x2e > [ 361.460527] [<ffffffff81141566>] ? kjournald+0x0/0x1cb > [ 361.460530] [<ffffffff8103819b>] ? kthread+0x78/0x80 > [ 361.460532] [<ffffffff810238df>] ? finish_task_switch+0x2b/0x78 > [ 361.460534] [<ffffffff81002f6a>] ? child_rip+0xa/0x20 > [ 361.460537] [<ffffffff81012ac3>] ? native_pax_close_kernel+0x0/0x32 > [ 361.460540] [<ffffffff81038123>] ? kthread+0x0/0x80 > [ 361.460542] [<ffffffff81002f60>] ? child_rip+0x0/0x20 > [ 361.460544] INFO: task dd:2132 blocked for more than 120 seconds. > [ 361.460546] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [ 361.460547] dd D ffff88000a21f0f8 0 2132 2090 > 0x00000080 > [ 361.460550] ffff88000a21ee80 0000000000000082 ffff88000a21ee80 > ffff88000b3affd8 > [ 361.460553] ffff88000b3affd8 ffff88000b3affd8 ffffffff81808e00 > ffff880001af3510 > [ 361.460556] ffff88000b78eaf0 ffff88000b3daa00 ffff880008de6c40 > ffff88000ab44a80 > [ 361.460558] Call Trace: > [ 361.460561] [<ffffffff8113dda5>] ? do_get_write_access+0x1f5/0x3b6 > [ 361.460564] [<ffffffff81061956>] ? get_dirty_limits+0x1dc/0x210 > [ 361.460566] [<ffffffff810385a7>] ? wake_bit_function+0x0/0x23 > [ 361.460569] [<ffffffff810a6218>] ? __getblk+0x1c/0x26c > [ 361.460576] [<ffffffff8155d1d0>] ? __func__.28446+0x0/0x20 > [ 361.460578] [<ffffffff8113df88>] ? journal_get_write_access+0x22/0x34 > [ 361.460582] [<ffffffff8110dd9b>] ? > __ext3_journal_get_write_access+0x1e/0x47 > [ 361.460584] [<ffffffff81101c4d>] ? ext3_reserve_inode_write+0x3e/0x75 > [ 361.460587] [<ffffffff81101c9a>] ? ext3_mark_inode_dirty+0x16/0x31 > [ 361.460589] [<ffffffff81101deb>] ? ext3_dirty_inode+0x62/0x7a > [ 361.460592] [<ffffffff810a10d9>] ? __mark_inode_dirty+0x25/0x134 > [ 361.460595] [<ffffffff81098b80>] ? file_update_time+0xd4/0xfb > [ 361.460598] [<ffffffff8105ced8>] ? __generic_file_aio_write+0x16c/0x290 > [ 361.460600] [<ffffffff8105d055>] ? generic_file_aio_write+0x59/0x9f > [ 361.460603] [<ffffffff81087ab5>] ? do_sync_write+0xcd/0x112 > [ 361.460606] [<ffffffff810132d4>] ? pvclock_clocksource_read+0x3a/0x70 > [ 361.460609] [<ffffffff81038579>] ? autoremove_wake_function+0x0/0x2e > [ 361.460612] [<ffffffff81000d1a>] ? __switch_to+0x177/0x255 > [ 361.460621] [<ffffffff8127891e>] ? selinux_file_permission+0x4d/0xa3 > [ 361.460624] [<ffffffff810883d8>] ? vfs_write+0xfc/0x138 > [ 361.460627] [<ffffffff810884d0>] ? sys_write+0x45/0x6e > [ 361.460629] [<ffffffff810020ff>] ? system_call_fastpath+0x16/0x1b > [ 361.460632] INFO: task sync:2164 blocked for more than 120 seconds. > [ 361.460633] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [ 361.460639] sync D ffff88000ba11f88 0 2164 2136 > 0x00000080 > [ 361.460642] ffff88000ba11d10 0000000000000086 0000000100000246 > ffff88000b1e9fd8 > [ 361.460645] ffff88000b1e9fd8 ffff88000b1e9fd8 ffffffff81808e00 > ffff88000b3daa00 > [ 361.460648] 00000000000001cc ffff88000b1e9e68 ffff88000b1e9e80 > ffff88000b3daa78 > [ 361.460651] Call Trace: > [ 361.460653] [<ffffffff8114122b>] ? log_wait_commit+0x9e/0xe0 > [ 361.460656] [<ffffffff81038579>] ? autoremove_wake_function+0x0/0x2e > [ 361.460659] [<ffffffff81108fe7>] ? ext3_sync_fs+0x42/0x4b > [ 361.460669] [<ffffffff810c9711>] ? sync_quota_sb+0x45/0xf6 > [ 361.460672] [<ffffffff810a4cd2>] ? __sync_filesystem+0x43/0x70 > [ 361.460675] [<ffffffff810a4d86>] ? sync_filesystems+0x87/0xbd > [ 361.460677] [<ffffffff810a4e01>] ? sys_sync+0x1c/0x2e > [ 361.460679] [<ffffffff810020ff>] ? system_call_fastpath+0x16/0x1b > > -- > To unsubscribe from this list: send the line "unsubscribe kvm" in > the body of a message to majordomo@xxxxxxxxxxxxxxx > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html