[Bug 197861] New: Shutting down a VM with Kernel 4.14 will sometimes hang and a reboot is the only way to recover.

bugzilla-daemon@xxxxxxxxxxxxxxxxxxx · Mon, 13 Nov 2017 15:35:57 +0000

https://bugzilla.kernel.org/show_bug.cgi?id=197861

            Bug ID: 197861
           Summary: Shutting down a VM with Kernel 4.14 will sometimes hang
                    and a reboot is the only way to recover.
           Product: Virtualization
           Version: unspecified
    Kernel Version: 4.14-rc8
          Hardware: Intel
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: kvm
          Assignee: virtualization_kvm@xxxxxxxxxxxxxxxxxxxx
          Reporter: hilld@xxxxxxxxxxxxxxx
        Regression: No

[ 7496.552971] INFO: task qemu-system-x86:5978 blocked for more than 120 seconds.
[ 7496.552987]       Tainted: G          I     4.14.0-0.rc1.git3.1.fc28.x86_64 #1
[ 7496.552996] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 7496.553006] qemu-system-x86 D12240  5978      1 0x00000004
[ 7496.553024] Call Trace:
[ 7496.553044]  __schedule+0x2dc/0xbb0
[ 7496.553055]  ? trace_hardirqs_on+0xd/0x10
[ 7496.553074]  schedule+0x3d/0x90
[ 7496.553087]  vhost_net_ubuf_put_and_wait+0x73/0xa0 [vhost_net]
[ 7496.553100]  ? finish_wait+0x90/0x90
[ 7496.553115]  vhost_net_ioctl+0x542/0x910 [vhost_net]
[ 7496.553144]  do_vfs_ioctl+0xa6/0x6c0
[ 7496.553166]  SyS_ioctl+0x79/0x90
[ 7496.553182]  entry_SYSCALL_64_fastpath+0x1f/0xbe
[ 7496.553190] RIP: 0033:0x7fa1ea0e1817
[ 7496.553196] RSP: 002b:00007ffe3d854bc8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 7496.553209] RAX: ffffffffffffffda RBX: 000000000000001d RCX: 00007fa1ea0e1817
[ 7496.553215] RDX: 00007ffe3d854bd0 RSI: 000000004008af30 RDI: 0000000000000021
[ 7496.553222] RBP: 000055e33352b610 R08: 000055e33024a6f0 R09: 000055e330245d92
[ 7496.553228] R10: 000055e33344e7f0 R11: 0000000000000246 R12: 000055e33351a000
[ 7496.553235] R13: 0000000000000001 R14: 0000000400000000 R15: 0000000000000000
[ 7496.553284]
               Showing all locks held in the system:
[ 7496.553313] 1 lock held by khungtaskd/161:
[ 7496.553319]  #0:  (tasklist_lock){.+.+}, at: [<ffffffff8111740d>] debug_show_all_locks+0x3d/0x1a0
[ 7496.553373] 1 lock held by in:imklog/1194:
[ 7496.553379]  #0:  (&f->f_pos_lock){+.+.}, at: [<ffffffff8130ecfc>] __fdget_pos+0x4c/0x60
[ 7496.553541] 1 lock held by qemu-system-x86/5978:
[ 7496.553547]  #0:  (&dev->mutex#3){+.+.}, at: [<ffffffffc077e498>] vhost_net_ioctl+0x358/0x910 [vhost_net]

I'm currently bisecting to find the commit that introduced this, but for some
reason, when I hit this commit:

# bad: [46d4b68f891bee5d83a32508bfbd9778be6b1b63] Merge tag 'wireless-drivers-next-for-davem-2017-08-07' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
git bisect bad 46d4b68f891bee5d83a32508bfbd9778be6b1b63

the host does not allow sshd to establish a new session, and starting a KVM
guest hard-locks the host. I'm still bisecting, but I marked that commit as
bad even though it may actually be good.
