Re: Shutting down a VM with Kernel 4.14 will sometimes hang and a reboot is the only way to recover.


 





On 2017-12-06 11:34 PM, David Hill wrote:


On 2017-12-04 02:51 PM, David Hill wrote:

On 2017-12-03 11:08 PM, Jason Wang wrote:


On 2017-12-02 00:38, David Hill wrote:

Finally, I reverted 581fe0ea61584d88072527ae9fb9dcb9d1f2783e too ... compiling and I'll keep you posted.

So I'm still able to reproduce this issue even after reverting these 3 commits.  Do you have any other suspect commits?

Thanks for testing. No, I don't have other suspect commits.

Looks like somebody else is hitting your issue too (see https://www.spinics.net/lists/netdev/msg468319.html)

But he claims the issue was fixed by using qemu 2.10.1.

So you may:

-try to see if qemu 2.10.1 solves your issue
It didn't solve it for him... it only made it harder to reproduce. [1]
-if not, try to see if commit 2ddf71e23cc246e95af72a6deed67b4a50a7b81c ("net: add notifier hooks for devmap bpf map") is the first bad commit
I'll try to see what I can do here
I'm looking at that commit, and it was introduced before v4.13 if I'm not mistaken, while this issue appeared between v4.13 and v4.14-rc1.  Between those two releases there are 1352 commits.  Is there a way to quickly identify which commits touch vhost-net or zerocopy?
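(A sketch of one way to narrow that down, assuming the usual tree layout: something like

  git log --oneline v4.13..v4.14-rc1 -- drivers/vhost/ drivers/net/tun.c

should list only the commits in that range touching the vhost/tun code, and adding --grep=zerocopy (or --grep=zcopy) to a plain git log over the same range should catch zerocopy-related commits elsewhere in the tree.)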


[ 7496.553044]  __schedule+0x2dc/0xbb0
[ 7496.553055]  ? trace_hardirqs_on+0xd/0x10
[ 7496.553074]  schedule+0x3d/0x90
[ 7496.553087]  vhost_net_ubuf_put_and_wait+0x73/0xa0 [vhost_net]
[ 7496.553100]  ? finish_wait+0x90/0x90
[ 7496.553115]  vhost_net_ioctl+0x542/0x910 [vhost_net]
[ 7496.553144]  do_vfs_ioctl+0xa6/0x6c0
[ 7496.553166]  SyS_ioctl+0x79/0x90
[ 7496.553182]  entry_SYSCALL_64_fastpath+0x1f/0xbe
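For what it's worth, that trace shows vhost_net_ioctl() sleeping in vhost_net_ubuf_put_and_wait(), i.e. waiting for every outstanding zerocopy buffer to complete before the backend can be torn down.  A rough userspace sketch of that pattern (illustrative only, not the kernel source; the names and types here are made up):

#include <stdatomic.h>
#include <stdbool.h>

/* One reference per in-flight zerocopy buffer, plus one held by the
 * caller; a zerocopy completion drops a reference. */
struct ubuf_ref {
	atomic_int refcount;
};

/* Drop one reference; report whether it was the last one. */
static bool ubuf_put(struct ubuf_ref *u)
{
	return atomic_fetch_sub(&u->refcount, 1) == 1;
}

/* Teardown path: drop our reference, then wait until every outstanding
 * zerocopy completion has dropped its reference too.  If one completion
 * never arrives, this waits forever -- which is what the hung ioctl in
 * the trace above looks like (the kernel sleeps on a waitqueue instead
 * of spinning like this sketch does). */
static void ubuf_put_and_wait(struct ubuf_ref *u)
{
	ubuf_put(u);
	while (atomic_load(&u->refcount) != 0)
		;
}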

That vhost_net_ubuf_put_and_wait call was changed in this commit, with the following message:

commit 0ad8b480d6ee916aa84324f69acf690142aecd0e
Author: Michael S. Tsirkin <mst@xxxxxxxxxx>
Date:   Thu Feb 13 11:42:05 2014 +0200

    vhost: fix ref cnt checking deadlock

    vhost checked the counter within the refcnt before decrementing.  It
    really wanted to know that it is the one that has the last reference, as
    a way to batch freeing resources a bit more efficiently.

    Note: we only let refcount go to 0 on device release.

    This works well but we now access the ref counter twice so there's a
    race: all users might see a high count and decide to defer freeing
    resources.
    In the end no one initiates freeing resources until the last reference
    is gone (which is on VM shotdown so might happen after a looooong time).

    Let's do what we probably should have done straight away:
    switch from kref to plain atomic, documenting the
    semantics, return the refcount value atomically after decrement,
    then use that to avoid the deadlock.

    Reported-by: Qin Chuanyu <qinchuanyu@xxxxxxxxxx>
    Signed-off-by: Michael S. Tsirkin <mst@xxxxxxxxxx>
    Acked-by: Jason Wang <jasowang@xxxxxxxxxx>
    Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx>
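Reading that description, the pre-fix problem was a check-then-decrement on the reference count.  A simplified sketch of the race it describes (made-up names, thresholds simplified; not the actual vhost code):

#include <stdatomic.h>

void free_resources(void);   /* the batched cleanup the commit refers to */

atomic_int refcount;

/* Before the fix: the count is read separately from the decrement, so two
 * concurrent callers can both see a high value, both decide they are not
 * the last user, and nothing gets freed until the final reference goes
 * away at device release (i.e. VM shutdown). */
void put_racy(void)
{
	int last = (atomic_load(&refcount) == 1);
	atomic_fetch_sub(&refcount, 1);
	if (last)
		free_resources();
}

/* After the fix: decrement and test in a single atomic step, so exactly
 * one caller observes the transition and triggers the cleanup. */
void put_fixed(void)
{
	if (atomic_fetch_sub(&refcount, 1) == 1)
		free_resources();
}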



So at this point, are we hitting a deadlock when using experimental_zcopytx?
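(If it helps to confirm that, one quick check might be to reload vhost_net with zerocopy TX turned off -- e.g. "modprobe -r vhost_net && modprobe vhost_net experimental_zcopytx=0" while no guest is using it -- or to read /sys/module/vhost_net/parameters/experimental_zcopytx to see whether it is currently enabled.  If the shutdown hang disappears with zerocopy disabled, that would point at a zerocopy completion that never arrives.)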



-if not, maybe you can continue your bisection through git bisect skip

Some commits are so broken that the system won't boot...  What I fear is that if I git bisect skip those commits, I'll also skip the commit that caused my original problem.
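(For what it's worth, git bisect skip shouldn't silently lose the culprit: if the first bad commit ends up among the skipped ones, bisect stops with "we cannot bisect more" and lists the skipped commits it could be, rather than blaming a wrong commit.  So skipping unbootable commits mostly costs precision, not correctness.)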

[1] https://www.spinics.net/lists/netdev/msg469887.html




