Re: repeatable hang with loop mount and heavy IO in guest

Some months ago I also thought elevator=noop would be a good idea.
But it isn't. It works well only as long as you do short I/O requests.
Try using the deadline scheduler in both host and guest.
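
For reference, a minimal sketch of how one might check which scheduler
is active before switching. It assumes the usual sysfs layout
(/sys/block/<dev>/queue/scheduler, with the active scheduler shown in
square brackets). The helper names and the device name vda are
illustrative only and not taken from the original report.

```shell
#!/bin/sh
# Extract the active scheduler (the name in square brackets) from a
# line in the format of /sys/block/<dev>/queue/scheduler.
# Prints nothing if the line contains no bracketed name.
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Print the active scheduler for each block device that exposes one.
show_schedulers() {
    for f in /sys/block/*/queue/scheduler; do
        [ -r "$f" ] || continue          # skip if sysfs entry is absent
        dev=${f#/sys/block/}
        dev=${dev%%/*}
        printf '%s: %s\n' "$dev" "$(active_scheduler "$(cat "$f")")"
    done
}

show_schedulers

# To switch a device at runtime (as root), assuming a virtio disk
# named vda (hypothetical):
#   echo deadline > /sys/block/vda/queue/scheduler
# To change the boot default on kernels of this era, boot with
# elevator=deadline instead of elevator=noop.
```

Run it once in the host and once in the guest, since the setting is
per kernel and per device.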

Robert


On 01/21/10 18:26, Antoine Martin wrote:
> I've tried various guests, including the most recent Fedora 12 kernels
> and custom 2.6.32.x builds.
> All of them hang around the same point (~1GB written) when I do heavy IO
> writes inside the guest.
> I have waited 30 minutes to see if the guest would recover, but it just
> sits there, not writing back any data, not doing anything - but
> certainly not allowing any new IO writes. The host has some load on it,
> but nothing heavy enough to completely hang a guest for that long.
> 
> mount -o loop some_image.fs ./somewhere
> dd if=/dev/zero of=./somewhere/zero bs=512
> then after ~1GB: sync
> 
> Host is running: 2.6.31.4
> QEMU PC emulator version 0.10.50 (qemu-kvm-devel-88)
> 
> Guests are booted with "elevator=noop" as the filesystems are stored as
> files, accessed as virtio disks.
> 
> 
> The "hung" backtraces always look similar to these:
> [  361.460136] INFO: task loop0:2097 blocked for more than 120 seconds.
> [  361.460139] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  361.460142] loop0         D ffff88000b92c848     0  2097      2 0x00000080
> [  361.460148]  ffff88000b92c5d0 0000000000000046 ffff880008c1f810 ffff880009829fd8
> [  361.460153]  ffff880009829fd8 ffff880009829fd8 ffff88000a21ee80 ffff88000b92c5d0
> [  361.460157]  ffff880009829610 ffffffff8181b768 ffff880001af33b0 0000000000000002
> [  361.460161] Call Trace:
> [  361.460216]  [<ffffffff8105bf12>] ? sync_page+0x0/0x43
> [  361.460253]  [<ffffffff8151383e>] ? io_schedule+0x2c/0x43
> [  361.460257]  [<ffffffff8105bf50>] ? sync_page+0x3e/0x43
> [  361.460261]  [<ffffffff81513a2a>] ? __wait_on_bit+0x41/0x71
> [  361.460264]  [<ffffffff8105c092>] ? wait_on_page_bit+0x6a/0x70
> [  361.460283]  [<ffffffff810385a7>] ? wake_bit_function+0x0/0x23
> [  361.460287]  [<ffffffff81064975>] ? shrink_page_list+0x3e5/0x61e
> [  361.460291]  [<ffffffff81513992>] ? schedule_timeout+0xa3/0xbe
> [  361.460305]  [<ffffffff81038579>] ? autoremove_wake_function+0x0/0x2e
> [  361.460308]  [<ffffffff8106538f>] ? shrink_zone+0x7e1/0xaf6
> [  361.460310]  [<ffffffff81061725>] ? determine_dirtyable_memory+0xd/0x17
> [  361.460314]  [<ffffffff810637da>] ? isolate_pages_global+0xa3/0x216
> [  361.460316]  [<ffffffff81062712>] ? mark_page_accessed+0x2a/0x39
> [  361.460335]  [<ffffffff810a61db>] ? __find_get_block+0x13b/0x15c
> [  361.460337]  [<ffffffff81065ed4>] ? try_to_free_pages+0x1ab/0x2c9
> [  361.460340]  [<ffffffff81063737>] ? isolate_pages_global+0x0/0x216
> [  361.460343]  [<ffffffff81060baf>] ? __alloc_pages_nodemask+0x394/0x564
> [  361.460350]  [<ffffffff8108250c>] ? __slab_alloc+0x137/0x44f
> [  361.460371]  [<ffffffff812cc4c1>] ? radix_tree_preload+0x1f/0x6a
> [  361.460374]  [<ffffffff81082a08>] ? kmem_cache_alloc+0x5d/0x88
> [  361.460376]  [<ffffffff812cc4c1>] ? radix_tree_preload+0x1f/0x6a
> [  361.460379]  [<ffffffff8105c0b5>] ? add_to_page_cache_locked+0x1d/0xf1
> [  361.460381]  [<ffffffff8105c1b0>] ? add_to_page_cache_lru+0x27/0x57
> [  361.460384]  [<ffffffff8105c25a>] ? grab_cache_page_write_begin+0x7a/0xa0
> [  361.460399]  [<ffffffff81104620>] ? ext3_write_begin+0x7e/0x201
> [  361.460417]  [<ffffffff8134648f>] ? do_lo_send_aops+0xa1/0x174
> [  361.460420]  [<ffffffff81081948>] ? virt_to_head_page+0x9/0x2a
> [  361.460422]  [<ffffffff8134686b>] ? loop_thread+0x309/0x48a
> [  361.460425]  [<ffffffff813463ee>] ? do_lo_send_aops+0x0/0x174
> [  361.460427]  [<ffffffff81038579>] ? autoremove_wake_function+0x0/0x2e
> [  361.460430]  [<ffffffff81346562>] ? loop_thread+0x0/0x48a
> [  361.460432]  [<ffffffff8103819b>] ? kthread+0x78/0x80
> [  361.460441]  [<ffffffff810238df>] ? finish_task_switch+0x2b/0x78
> [  361.460454]  [<ffffffff81002f6a>] ? child_rip+0xa/0x20
> [  361.460460]  [<ffffffff81012ac3>] ? native_pax_close_kernel+0x0/0x32
> [  361.460463]  [<ffffffff81038123>] ? kthread+0x0/0x80
> [  361.460469]  [<ffffffff81002f60>] ? child_rip+0x0/0x20
> [  361.460471] INFO: task kjournald:2098 blocked for more than 120 seconds.
> [  361.460473] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  361.460474] kjournald     D ffff88000b92e558     0  2098      2 0x00000080
> [  361.460477]  ffff88000b92e2e0 0000000000000046 ffff88000aad9840 ffff88000983ffd8
> [  361.460480]  ffff88000983ffd8 ffff88000983ffd8 ffffffff81808e00 ffff88000b92e2e0
> [  361.460483]  ffff88000983fcf0 ffffffff8181b768 ffff880001af3c40 0000000000000002
> [  361.460486] Call Trace:
> [  361.460488]  [<ffffffff810a6b16>] ? sync_buffer+0x0/0x3c
> [  361.460491]  [<ffffffff8151383e>] ? io_schedule+0x2c/0x43
> [  361.460494]  [<ffffffff810a6b4e>] ? sync_buffer+0x38/0x3c
> [  361.460496]  [<ffffffff81513a2a>] ? __wait_on_bit+0x41/0x71
> [  361.460499]  [<ffffffff810a6b16>] ? sync_buffer+0x0/0x3c
> [  361.460501]  [<ffffffff81513ac4>] ? out_of_line_wait_on_bit+0x6a/0x76
> [  361.460504]  [<ffffffff810385a7>] ? wake_bit_function+0x0/0x23
> [  361.460514]  [<ffffffff8113edad>] ? journal_commit_transaction+0x769/0xbb8
> [  361.460517]  [<ffffffff810238df>] ? finish_task_switch+0x2b/0x78
> [  361.460519]  [<ffffffff815137d9>] ? thread_return+0x40/0x79
> [  361.460522]  [<ffffffff8114162d>] ? kjournald+0xc7/0x1cb
> [  361.460525]  [<ffffffff81038579>] ? autoremove_wake_function+0x0/0x2e
> [  361.460527]  [<ffffffff81141566>] ? kjournald+0x0/0x1cb
> [  361.460530]  [<ffffffff8103819b>] ? kthread+0x78/0x80
> [  361.460532]  [<ffffffff810238df>] ? finish_task_switch+0x2b/0x78
> [  361.460534]  [<ffffffff81002f6a>] ? child_rip+0xa/0x20
> [  361.460537]  [<ffffffff81012ac3>] ? native_pax_close_kernel+0x0/0x32
> [  361.460540]  [<ffffffff81038123>] ? kthread+0x0/0x80
> [  361.460542]  [<ffffffff81002f60>] ? child_rip+0x0/0x20
> [  361.460544] INFO: task dd:2132 blocked for more than 120 seconds.
> [  361.460546] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  361.460547] dd            D ffff88000a21f0f8     0  2132   2090 0x00000080
> [  361.460550]  ffff88000a21ee80 0000000000000082 ffff88000a21ee80 ffff88000b3affd8
> [  361.460553]  ffff88000b3affd8 ffff88000b3affd8 ffffffff81808e00 ffff880001af3510
> [  361.460556]  ffff88000b78eaf0 ffff88000b3daa00 ffff880008de6c40 ffff88000ab44a80
> [  361.460558] Call Trace:
> [  361.460561]  [<ffffffff8113dda5>] ? do_get_write_access+0x1f5/0x3b6
> [  361.460564]  [<ffffffff81061956>] ? get_dirty_limits+0x1dc/0x210
> [  361.460566]  [<ffffffff810385a7>] ? wake_bit_function+0x0/0x23
> [  361.460569]  [<ffffffff810a6218>] ? __getblk+0x1c/0x26c
> [  361.460576]  [<ffffffff8155d1d0>] ? __func__.28446+0x0/0x20
> [  361.460578]  [<ffffffff8113df88>] ? journal_get_write_access+0x22/0x34
> [  361.460582]  [<ffffffff8110dd9b>] ? __ext3_journal_get_write_access+0x1e/0x47
> [  361.460584]  [<ffffffff81101c4d>] ? ext3_reserve_inode_write+0x3e/0x75
> [  361.460587]  [<ffffffff81101c9a>] ? ext3_mark_inode_dirty+0x16/0x31
> [  361.460589]  [<ffffffff81101deb>] ? ext3_dirty_inode+0x62/0x7a
> [  361.460592]  [<ffffffff810a10d9>] ? __mark_inode_dirty+0x25/0x134
> [  361.460595]  [<ffffffff81098b80>] ? file_update_time+0xd4/0xfb
> [  361.460598]  [<ffffffff8105ced8>] ? __generic_file_aio_write+0x16c/0x290
> [  361.460600]  [<ffffffff8105d055>] ? generic_file_aio_write+0x59/0x9f
> [  361.460603]  [<ffffffff81087ab5>] ? do_sync_write+0xcd/0x112
> [  361.460606]  [<ffffffff810132d4>] ? pvclock_clocksource_read+0x3a/0x70
> [  361.460609]  [<ffffffff81038579>] ? autoremove_wake_function+0x0/0x2e
> [  361.460612]  [<ffffffff81000d1a>] ? __switch_to+0x177/0x255
> [  361.460621]  [<ffffffff8127891e>] ? selinux_file_permission+0x4d/0xa3
> [  361.460624]  [<ffffffff810883d8>] ? vfs_write+0xfc/0x138
> [  361.460627]  [<ffffffff810884d0>] ? sys_write+0x45/0x6e
> [  361.460629]  [<ffffffff810020ff>] ? system_call_fastpath+0x16/0x1b
> [  361.460632] INFO: task sync:2164 blocked for more than 120 seconds.
> [  361.460633] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  361.460639] sync          D ffff88000ba11f88     0  2164   2136 0x00000080
> [  361.460642]  ffff88000ba11d10 0000000000000086 0000000100000246 ffff88000b1e9fd8
> [  361.460645]  ffff88000b1e9fd8 ffff88000b1e9fd8 ffffffff81808e00 ffff88000b3daa00
> [  361.460648]  00000000000001cc ffff88000b1e9e68 ffff88000b1e9e80 ffff88000b3daa78
> [  361.460651] Call Trace:
> [  361.460653]  [<ffffffff8114122b>] ? log_wait_commit+0x9e/0xe0
> [  361.460656]  [<ffffffff81038579>] ? autoremove_wake_function+0x0/0x2e
> [  361.460659]  [<ffffffff81108fe7>] ? ext3_sync_fs+0x42/0x4b
> [  361.460669]  [<ffffffff810c9711>] ? sync_quota_sb+0x45/0xf6
> [  361.460672]  [<ffffffff810a4cd2>] ? __sync_filesystem+0x43/0x70
> [  361.460675]  [<ffffffff810a4d86>] ? sync_filesystems+0x87/0xbd
> [  361.460677]  [<ffffffff810a4e01>] ? sys_sync+0x1c/0x2e
> [  361.460679]  [<ffffffff810020ff>] ? system_call_fastpath+0x16/0x1b
> 
> -- 
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html