On Wed 13-09-17 15:07:26, Jorgen S. Hansen wrote:
>
> > On Sep 12, 2017, at 11:08 AM, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> >
> > Hi,
> > we are seeing the following splat with Debian 3.16 stable kernel
> >
> > BUG: scheduling while atomic: MATLAB/26771/0x00000100
> > Modules linked in: veeamsnap(O) hmac cbc cts nfsv4 dns_resolver rpcsec_gss_krb5 nfsd auth_rpcgss oid_registry nfs_acl nfs lockd fscache sunrpc vmw_vso$
> > CPU: 0 PID: 26771 Comm: MATLAB Tainted: G O 3.16.0-4-amd64 #1 Debian 3.16.7-ckt20-1+deb8u3
> > Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/21/2015
> > ffff88315c1e4c20 ffffffff8150db3f ffff88193f803dc8 ffffffff8150acdf
> > ffffffff815103a2 0000000000012f00 ffff8819423dbfd8 0000000000012f00
> > ffff88315c1e4c20 ffff88193f803dc8 ffff88193f803d50 ffff88193f803dc0
> > Call Trace:
> > <IRQ> [<ffffffff8150db3f>] ? dump_stack+0x41/0x51
> > [<ffffffff8150acdf>] ? __schedule_bug+0x48/0x55
> > [<ffffffff815103a2>] ? __schedule+0x5d2/0x700
> > [<ffffffff8150f9b9>] ? schedule_timeout+0x229/0x2a0
> > [<ffffffff8109ba70>] ? select_task_rq_fair+0x390/0x700
> > [<ffffffff8109f780>] ? check_preempt_wakeup+0x120/0x1d0
> > [<ffffffff81510eb8>] ? wait_for_completion+0xa8/0x120
> > [<ffffffff81096de0>] ? wake_up_state+0x10/0x10
> > [<ffffffff810c3da0>] ? call_rcu_bh+0x20/0x20
> > [<ffffffff810c180b>] ? wait_rcu_gp+0x4b/0x60
> > [<ffffffff810c17b0>] ? ftrace_raw_output_rcu_utilization+0x40/0x40
> > [<ffffffffa02ca6f5>] ? vmci_event_unsubscribe+0x75/0xb0 [vmw_vmci]
> > [<ffffffffa031f5cd>] ? vmci_transport_destruct+0x1d/0xe0 [vmw_vsock_vmci_transport]
> > [<ffffffffa03167e3>] ? vsock_sk_destruct+0x13/0x60 [vsock]
> > [<ffffffff81409f7a>] ? __sk_free+0x1a/0x130
> > [<ffffffffa0320218>] ? vmci_transport_recv_stream_cb+0x1e8/0x2d0 [vmw_vsock_vmci_transport]
> > [<ffffffffa02c9cba>] ? vmci_datagram_invoke_guest_handler+0xaa/0xd0 [vmw_vmci]
> > [<ffffffffa02cab51>] ? vmci_dispatch_dgs+0xc1/0x200 [vmw_vmci]
> > [<ffffffff8106c294>] ? tasklet_action+0xf4/0x100
> > [<ffffffff8106c681>] ? __do_softirq+0xf1/0x290
> > [<ffffffff8106ca55>] ? irq_exit+0x95/0xa0
> > [<ffffffff81516b22>] ? do_IRQ+0x52/0xe0
> > [<ffffffff8151496d>] ? common_interrupt+0x6d/0x6d
> >
> > AFAICS this has been fixed by 4ef7ea9195ea ("VSOCK: sock_put wasn't safe
> > to call in interrupt context") but this patch hasn't been backported to
> > stable trees. It applies cleanly on top of 3.16 stable tree but I am not
> > familiar with the code to send the backport to the stable maintainer
> > directly.
> >
> > Could you double check that the patch below (just a blind cherry-pick)
> > is correct and it doesn't need additional patches on top?
>
> Hi,
>
> The patch below has been used to fix the above issue by other distros
> - among them Redhat for the 3.10 kernel, so it should work for 3.16 as
> well.

Thanks for the confirmation. I do not see 4ef7ea9195ea ("VSOCK: sock_put
wasn't safe to call in interrupt context") in 3.10 stable branch though.

> In addition to the patch above, there are two other patches that
> need to be applied on top for the fix to be correct:
>
> 8566b86ab9f0f45bc6f7dd422b21de9d0cf5415a "VSOCK: Fix lockdep issue."
>
> and
>
> 8ab18d71de8b07d2c4d6f984b718418c09ea45c5 "VSOCK: Detach QP check should filter out non matching QPs."

Good to know. I will send all three patches cherry-picked on top of the
current 3.16 stable branch. Could you have a look please?
-- 
Michal Hocko
SUSE Labs