Potential deadlock in Linux VMCI driver

During my development, I enabled some Linux kernel checkers, specifically
the “sleep in atomic” checker.

I ran into an unrelated issue that appears to be a result of commit
463713eb6164b6 ("VMCI: dma dg: add support for DMA datagrams receive").
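IIUC, vmci_read_data() calls wait_event(), which may sleep and is therefore
not allowed in atomic context. Per the trace, it is reached from
vmci_dispatch_dgs(), which runs as a tasklet in softirq context on the way
out of the interrupt (the splat reports in_atomic(): 1, irqs_disabled(): 0).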

I think "CONFIG_DEBUG_ATOMIC_SLEEP=y" is the one that triggers the warning
below, which indicates a deadlock is possible.

The splat below (after decoding) was experienced on Linux 5.19. Let me know
if you need me to open a bug in bugzilla or whether this issue is already
known.


[   22.629691] BUG: sleeping function called from invalid context at drivers/misc/vmw_vmci/vmci_guest.c:145
[   22.633894] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 775, name: cloud-init
[   22.638232] preempt_count: 100, expected: 0
[   22.641887] RCU nest depth: 0, expected: 0
[   22.645461] 1 lock held by cloud-init/775:
[   22.649013] #0: ffff88810e057200 (&type->i_mutex_dir_key#6){++++}-{3:3}, at: iterate_dir (fs/readdir.c:46) 
[   22.653012] Preemption disabled at:
[   22.653017] __do_softirq (kernel/softirq.c:504 kernel/softirq.c:548) 
[   22.660264] CPU: 3 PID: 775 Comm: cloud-init Not tainted 5.19.0+ #3
[   22.664004] Hardware name: VMware, Inc. VMware20,1/440BX Desktop Reference Platform, BIOS VMW201.00V.20253199.B64.2208081742 08/08/2022
[   22.671600] Call Trace:
[   22.675165]  <IRQ>
[   22.678681] dump_stack_lvl (lib/dump_stack.c:107 (discriminator 4)) 
[   22.682303] dump_stack (lib/dump_stack.c:114) 
[   22.685883] __might_resched.cold (kernel/sched/core.c:9822) 
[   22.689500] __might_sleep (kernel/sched/core.c:9751 (discriminator 14)) 
[   22.692961] vmci_read_data (./include/linux/kernel.h:110 drivers/misc/vmw_vmci/vmci_guest.c:145) vmw_vmci
[   22.696461] ? vmci_interrupt_bm (drivers/misc/vmw_vmci/vmci_guest.c:121) vmw_vmci
[   22.699920] ? __this_cpu_preempt_check (lib/smp_processor_id.c:67) 
[   22.703305] ? wake_up_var (./include/linux/list.h:292 ./include/linux/wait.h:129 kernel/sched/wait_bit.c:125 kernel/sched/wait_bit.c:193) 
[   22.706526] ? cpuusage_read (kernel/sched/wait_bit.c:192) 
[   22.709682] ? mark_held_locks (kernel/locking/lockdep.c:4234) 
[   22.712779] vmci_dispatch_dgs (drivers/misc/vmw_vmci/vmci_guest.c:332) vmw_vmci
[   22.715923] tasklet_action_common.constprop.0 (kernel/softirq.c:799) 
[   22.719008] ? vmci_read_data (drivers/misc/vmw_vmci/vmci_guest.c:308) vmw_vmci
[   22.722018] tasklet_action (kernel/softirq.c:819) 
[   22.724865] __do_softirq (kernel/softirq.c:571) 
[   22.727650] __irq_exit_rcu (kernel/softirq.c:445 kernel/softirq.c:650) 
[   22.730348] irq_exit_rcu (kernel/softirq.c:664) 
[   22.732947] common_interrupt (arch/x86/kernel/irq.c:240 (discriminator 14)) 
[   22.735513]  </IRQ>
[   22.737879]  <TASK>
[   22.740141] asm_common_interrupt (./arch/x86/include/asm/idtentry.h:640) 
[   22.742498] RIP: 0010:stack_trace_consume_entry (kernel/stacktrace.c:83) 
[ 22.744891] Code: be 80 01 00 00 48 c7 c7 40 82 cd 82 48 89 e5 e8 7d 38 53 00 5d c3 cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 55 48 89 e5 <41> 55 49 89 f5 41 54 53 48 89 fb 48 83 c7 10 e8 23 e0 36 00 48 8d
All code
========
   0:	be 80 01 00 00       	mov    $0x180,%esi
   5:	48 c7 c7 40 82 cd 82 	mov    $0xffffffff82cd8240,%rdi
   c:	48 89 e5             	mov    %rsp,%rbp
   f:	e8 7d 38 53 00       	call   0x533891
  14:	5d                   	pop    %rbp
  15:	c3                   	ret    
  16:	cc                   	int3   
  17:	cc                   	int3   
  18:	cc                   	int3   
  19:	cc                   	int3   
  1a:	cc                   	int3   
  1b:	cc                   	int3   
  1c:	cc                   	int3   
  1d:	cc                   	int3   
  1e:	cc                   	int3   
  1f:	cc                   	int3   
  20:	cc                   	int3   
  21:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
  26:	55                   	push   %rbp
  27:	48 89 e5             	mov    %rsp,%rbp
  2a:*	41 55                	push   %r13		<-- trapping instruction
  2c:	49 89 f5             	mov    %rsi,%r13
  2f:	41 54                	push   %r12
  31:	53                   	push   %rbx
  32:	48 89 fb             	mov    %rdi,%rbx
  35:	48 83 c7 10          	add    $0x10,%rdi
  39:	e8 23 e0 36 00       	call   0x36e061
  3e:	48                   	rex.W
  3f:	8d                   	.byte 0x8d

Code starting with the faulting instruction
===========================================
   0:	41 55                	push   %r13
   2:	49 89 f5             	mov    %rsi,%r13
   5:	41 54                	push   %r12
   7:	53                   	push   %rbx
   8:	48 89 fb             	mov    %rdi,%rbx
   b:	48 83 c7 10          	add    $0x10,%rdi
   f:	e8 23 e0 36 00       	call   0x36e037
  14:	48                   	rex.W
  15:	8d                   	.byte 0x8d
[   22.750370] RSP: 0018:ffff8881250674d0 EFLAGS: 00000286
[   22.752906] RAX: ffffffff81676155 RBX: ffffffff81269600 RCX: ffffffff810e2106
[   22.755572] RDX: dffffc0000000000 RSI: ffffffff81676155 RDI: ffff8881250675a8
[   22.758217] RBP: ffff8881250674d0 R08: ffffffff810e20d4 R09: ffff88812f1a4000
[   22.760877] R10: ffff8881250674e0 R11: 0000000000000001 R12: ffff8881250675a8
[   22.763513] R13: 0000000000000000 R14: ffff88812f1a4000 R15: ffff88810f33c180
_______________________________________________
Virtualization mailing list
Virtualization@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/virtualization



