During Crystal Ridge testing in KVM virtual machines we have occasionally seen "suspicious RCU usage" messages. They don't seem to be tied to any one specific test, and my guess is that we are occasionally just losing a race. In the instance for the traces below we end up tripping the warning in write_pmem(). At this point the test hung forever (long enough for me to give up), I think in a livelock. I did some "echo w > /proc/sysrq-trigger" runs on the guest to see if we had a deadlock, but each time the threads were in slightly different places, so it doesn't look like a good deadlock. I sent this to Dan Williams, and his assessment was: > Looks like a bad interaction with paravirtualization support, not > necessarily a bug in our code. Might be good to ping KVM folks about > their expectations of paravirt faults under pagefault_disable() (from > kmap_atomic()). Any ideas about what is going on? Here is the test setup, with v4.10-rc8 + xfs + dax + generic/270 run in a loop: FSTYP -- xfs (debug) PLATFORM -- Linux/x86_64 alara 4.10.0-rc8 MKFS_OPTIONS -- -f -bsize=4096 /dev/pmem0p2 MOUNT_OPTIONS -- -o dax -o context=system_u:object_r:nfs_t:s0 /dev/pmem0p2 /mnt/xfstests_scratch generic/270 24s Kernel log: [ 278.188356] run fstests generic/270 at 2017-02-15 09:44:38 [ 278.571958] XFS (pmem0p2): Unmounting Filesystem [ 278.638849] XFS (pmem0p2): DAX enabled. Warning: EXPERIMENTAL, use at your own risk [ 278.641349] XFS (pmem0p2): Mounting V5 Filesystem [ 278.644692] XFS (pmem0p2): Ending clean mount [ 278.646480] XFS (pmem0p2): Quotacheck needed: Please wait. [ 278.649404] XFS (pmem0p2): Quotacheck: Done. [ 278.750112] XFS (pmem0p2): xlog_verify_grant_tail: space > BBTOB(tail_blocks) [ 278.904390] [ 278.904967] =============================== [ 278.906319] [ INFO: suspicious RCU usage. ] [ 278.907184] 4.10.0-rc8 #1 Not tainted [ 278.907856] ------------------------------- [ 278.908598] kernel/rcu/tree.c:933 rcu_irq_enter() invoked with irqs enabled!!!! [ 278.909777] [ 278.909777] other info that might help us debug this: [ 278.909777] [ 278.910874] [ 278.910874] rcu_scheduler_active = 2, debug_locks = 0 [ 278.911747] 3 locks held by 1769.fsstress.b/2134: [ 278.912425] #0: (sb_writers#8){.+.+.+}, at: [<ffffffff812ddeb8>] mnt_want_write_file+0x28/0x60 [ 278.913622] #1: (&sb->s_type->i_mutex_key#12){++++++}, at: [<ffffffff81536e67>] xfs_ilock+0x1d7/0x300 [ 278.914995] #2: (&(&ip->i_mmaplock)->mr_lock){+++++.}, at: [<ffffffff81536e1a>] xfs_ilock+0x18a/0x300 [ 278.916961] [ 278.916961] stack backtrace: [ 278.917863] CPU: 6 PID: 2134 Comm: 1769.fsstress.b Not tainted 4.10.0-rc8 #1 [ 278.919277] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014 [ 278.921894] Call Trace: [ 278.922452] dump_stack+0x86/0xc3 [ 278.923103] lockdep_rcu_suspicious+0xd7/0x110 [ 278.924129] rcu_irq_enter+0x74/0x80 [ 278.924881] kvm_async_pf_task_wait+0x19f/0x240 [ 278.925814] do_async_page_fault+0x63/0x90 [ 278.926637] async_page_fault+0x28/0x30 [ 278.927439] RIP: 0010:__copy_user_nocache+0x41/0xf0 [ 278.928455] RSP: 0018:ffffc9000982f748 EFLAGS: 00010202 [ 278.929544] RAX: ffff8800b14f0000 RBX: ffff8800027a5000 RCX: 0000000000000040 [ 278.931054] RDX: 0000000000000000 RSI: ffff8800027a5000 RDI: ffff88041ae89000 [ 278.932173] RBP: ffffc9000982f788 R08: 0000000000000000 R09: 0000000000000000 [ 278.933313] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800b14f0000 [ 278.934466] R13: ffff88041ae89000 R14: 0000000000001000 R15: 0000000000000000 [ 278.935437] ? write_pmem+0x78/0x100 [nd_pmem] [ 278.936039] ? debug_smp_processor_id+0x17/0x20 [ 278.936647] pmem_do_bvec+0x64/0x2d0 [nd_pmem] [ 278.937249] ? blk_queue_enter+0x41/0x2d0 [ 278.937788] pmem_make_request+0xf7/0x260 [nd_pmem] [ 278.938472] ? generic_make_request+0xf1/0x2e0 [ 278.939071] generic_make_request+0xfe/0x2e0 [ 278.939649] ? trace_hardirqs_on_caller+0xf5/0x1b0 [ 278.940295] submit_bio+0x73/0x150 [ 278.941372] ? __init_waitqueue_head+0x3b/0x50 [ 278.942652] submit_bio_wait+0x60/0x90 [ 278.943366] blkdev_issue_zeroout+0xf1/0x140 [ 278.944241] __dax_zero_page_range+0x9c/0x150 [ 278.945101] iomap_zero_range_actor+0x8c/0x290 [ 278.946255] ? iomap_fiemap_actor+0x80/0x80 [ 278.947382] iomap_apply+0xb1/0x130 [ 278.948397] ? iomap_fiemap_actor+0x80/0x80 [ 278.949602] iomap_zero_range+0x5d/0x90 [ 278.950700] ? iomap_fiemap_actor+0x80/0x80 [ 278.951902] xfs_zero_eof+0x6e/0x1c0 [ 278.952926] xfs_setattr_size+0x25c/0x3b0 [ 278.954076] xfs_vn_setattr_size+0x57/0x160 [ 278.955269] xfs_ioc_space+0x30f/0x3f0 [ 278.956341] ? __might_fault+0x8c/0xa0 [ 278.957271] ? __might_fault+0x43/0xa0 [ 278.957928] xfs_file_ioctl+0xbc2/0xd60 [ 278.958451] ? __might_sleep+0x4a/0x80 [ 278.959450] do_vfs_ioctl+0xa3/0x730 [ 278.960396] SyS_ioctl+0x79/0x90 [ 278.961284] entry_SYSCALL_64_fastpath+0x1f/0xc2 [ 278.962404] RIP: 0033:0x7fa24537cce7 [ 278.963427] RSP: 002b:00007ffcbe559368 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 278.965593] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fa24537cce7 [ 278.967641] RDX: 00007ffcbe559390 RSI: 0000000040305825 RDI: 0000000000000003 [ 278.969584] RBP: 000000000000001b R08: 000000000000007d R09: 00007ffcbe55937c [ 278.971587] R10: 0000000000000078 R11: 0000000000000246 R12: 0000000000000000 [ 278.973632] R13: 0000000000000003 R14: 0000000000000000 R15: 0000000000000000 [ 300.251989] 270 (1769): drop_caches: 3 and here are the "echo w > /proc/sysrq-trigger" dumps, which all appear to be a little different: First trace: [ 485.864822] sysrq: SysRq : Show Blocked State [ 485.867535] task PC stack pid father [ 485.871276] systemd-journal D 0 653 1 0x00000080 [ 485.873899] Call Trace: [ 485.875213] __schedule+0x2d5/0xab0 [ 485.876972] schedule+0x40/0x90 [ 485.878640] schedule_timeout+0x334/0x690 [ 485.880564] ? __delayacct_blkio_start+0x1f/0x30 [ 485.882622] ? kvm_clock_read+0x23/0x40 [ 485.884314] ? kvm_clock_get_cycles+0x9/0x10 [ 485.886066] ? ktime_get+0xb3/0x150 [ 485.887929] ? __delayacct_blkio_start+0x1f/0x30 [ 485.890044] io_schedule_timeout+0xa4/0x110 [ 485.892461] ? trace_hardirqs_on+0xd/0x10 [ 485.894882] __lock_page_or_retry+0x1e5/0x350 [ 485.896531] ? find_get_entry+0x5/0x2a0 [ 485.897999] ? unlock_page+0x50/0x50 [ 485.899809] filemap_fault+0x58d/0xa80 [ 485.901569] ? filemap_fault+0x7e4/0xa80 [ 485.903471] ? xfs_ilock+0x209/0x300 [ 485.905116] ? __schedule+0x2d5/0xab0 [ 485.906722] ? schedule+0x40/0x90 [ 485.908307] ? __do_fault+0x23/0x130 [ 485.909960] ? ep_poll+0xe9/0x3a0 [ 485.911915] ? schedule_hrtimeout_range+0x13/0x20 [ 485.913552] ? ep_poll+0x28a/0x3a0 [ 485.915018] ? wake_up_q+0x80/0x80 [ 485.916416] ? trace_do_page_fault+0x58/0x2c0 [ 485.918159] ? do_syscall_64+0x6c/0x1f0 [ 485.920027] ? entry_SYSCALL64_slow_path+0x25/0x25 [ 485.922479] xfs_db D 0 2308 2306 0x00000000 [ 485.924935] Call Trace: [ 485.926032] __schedule+0x2d5/0xab0 [ 485.927575] schedule+0x40/0x90 [ 485.928885] schedule_timeout+0x334/0x690 [ 485.930427] ? swap_cgroup_record+0x7a/0xa0 [ 485.931217] ? mem_cgroup_uncharge_swap+0x10/0x150 [ 485.932673] ? io_schedule_timeout+0xa4/0x110 [ 485.934547] ? trace_hardirqs_on+0xd/0x10 [ 485.936367] ? __lock_page_or_retry+0x1e5/0x350 [ 485.938351] ? do_swap_page+0x461/0x8c0 [ 485.940074] ? unlock_page+0x50/0x50 [ 485.941655] ? do_swap_page+0x2dd/0x8c0 [ 485.943418] ? handle_mm_fault+0x5a/0x16e0 [ 485.945656] ? handle_mm_fault+0xa31/0x16e0 [ 485.947557] ? handle_mm_fault+0x5a/0x16e0 [ 485.949315] ? __do_page_fault+0x13f/0x590 [ 485.951211] ? __do_page_fault+0x28f/0x590 [ 485.953231] ? trace_do_page_fault+0x58/0x2c0 [ 485.955368] ? retint_user+0x18/0x20 Second trace: [ 509.841216] sysrq: SysRq : Show Blocked State [ 509.841931] task PC stack pid father [ 509.842967] systemd-journal D 0 653 1 0x00000080 [ 509.843926] Call Trace: [ 509.844414] __schedule+0x2d5/0xab0 [ 509.845021] ? set_next_entity+0x104/0x300 [ 509.845843] ? pick_next_task_fair+0x751/0x790 [ 509.846928] ? __schedule+0x2d5/0xab0 [ 509.847661] ? schedule+0x40/0x90 [ 509.848854] ? schedule_timeout+0x334/0x690 [ 509.849589] ? __delayacct_blkio_start+0x1f/0x30 [ 509.850376] ? kvm_clock_read+0x23/0x40 [ 509.851034] ? kvm_clock_get_cycles+0x9/0x10 [ 509.851949] ? filemap_fault+0x58d/0xa80 [ 509.852627] ? io_schedule_timeout+0xa4/0x110 [ 509.853373] ? trace_hardirqs_on+0xd/0x10 [ 509.854060] ? do_async_page_fault+0x2c/0x90 [ 509.854788] ? __lock_page_or_retry+0x1e5/0x350 [ 509.855595] ? unlock_page+0x50/0x50 [ 509.856257] ? do_swap_page+0x4bb/0x8c0 [ 509.856946] ? do_swap_page+0x2fe/0x8c0 [ 509.857657] ? __do_fault+0x23/0x130 [ 509.858320] ? handle_mm_fault+0xa31/0x16e0 [ 509.859032] ? handle_mm_fault+0x5a/0x16e0 [ 509.859893] ? __do_page_fault+0x28f/0x590 [ 509.860641] ? trace_do_page_fault+0x58/0x2c0 [ 509.861422] ? do_async_page_fault+0x2c/0x90 [ 509.862185] ? async_page_fault+0x28/0x30 [ 509.862970] xfs_db D 0 2308 2306 0x00000000 [ 509.864322] Call Trace: [ 509.865182] __schedule+0x2d5/0xab0 [ 509.866337] schedule+0x40/0x90 [ 509.867396] schedule_timeout+0x334/0x690 [ 509.868728] ? __delayacct_blkio_start+0x1f/0x30 [ 509.870284] ? kvm_clock_read+0x23/0x40 [ 509.871597] ? kvm_clock_get_cycles+0x9/0x10 [ 509.873000] ? ktime_get+0xb3/0x150 [ 509.874160] ? __delayacct_blkio_start+0x1f/0x30 [ 509.875660] io_schedule_timeout+0xa4/0x110 [ 509.877045] ? trace_hardirqs_on+0xd/0x10 [ 509.878366] __lock_page_or_retry+0x1e5/0x350 [ 509.879166] ? unlock_page+0x50/0x50 [ 509.879807] do_swap_page+0x4bb/0x8c0 [ 509.880525] ? do_swap_page+0x2fe/0x8c0 [ 509.881579] handle_mm_fault+0xa31/0x16e0 [ 509.882275] ? handle_mm_fault+0x5a/0x16e0 [ 509.883006] __do_page_fault+0x28f/0x590 [ 509.883702] trace_do_page_fault+0x58/0x2c0 [ 509.884494] do_async_page_fault+0x2c/0x90 [ 509.885225] async_page_fault+0x28/0x30 [ 509.885914] RIP: 0033:0x40267c [ 509.886504] RSP: 002b:00007ffef2333320 EFLAGS: 00010206 [ 509.887410] RAX: 000000001200f000 RBX: 0000000000000000 RCX: 0000000000000000 [ 509.888573] RDX: 0000000000000000 RSI: 0000000000000014 RDI: 0000000001063030 [ 509.889749] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000 [ 509.890919] R10: 0000000000000002 R11: 0000000000000246 R12: 0000000000000001 [ 509.892091] R13: 0000000000000000 R14: 00000000010630d0 R15: 00007ffef2333558 Third trace: [ 561.808275] systemd-journald[653]: Sent WATCHDOG=1 notification. [ 562.952553] sysrq: SysRq : Show Blocked State [ 562.954143] task PC stack pid father [ 562.957149] systemd-journal D 0 653 1 0x00000080 [ 562.959551] Call Trace: [ 562.960718] __schedule+0x2d5/0xab0 [ 562.962441] schedule+0x40/0x90 [ 562.964260] schedule_timeout+0x334/0x690 [ 562.965924] ? __delayacct_blkio_start+0x1f/0x30 [ 562.967798] ? kvm_clock_read+0x23/0x40 [ 562.969382] ? kvm_clock_get_cycles+0x9/0x10 [ 562.971392] ? ktime_get+0xb3/0x150 [ 562.973036] ? __delayacct_blkio_start+0x1f/0x30 [ 562.975395] io_schedule_timeout+0xa4/0x110 [ 562.977304] ? trace_hardirqs_on+0xd/0x10 [ 562.979349] __lock_page_or_retry+0x1e5/0x350 [ 562.981333] ? find_get_entry+0x5/0x2a0 [ 562.982776] ? unlock_page+0x50/0x50 [ 562.984267] filemap_fault+0x58d/0xa80 [ 562.985550] ? filemap_fault+0x7e4/0xa80 [ 562.986514] ? xfs_ilock+0x209/0x300 [ 562.987789] ? xfs_filemap_fault+0x65/0x1d0 [ 562.991690] ? down_read_nested+0x76/0xb0 [ 562.993325] ? xfs_ilock+0x209/0x300 [ 562.994491] xfs_filemap_fault+0x7b/0x1d0 [ 562.996194] __do_fault+0x23/0x130 [ 562.997524] handle_mm_fault+0xf39/0x16e0 [ 562.999146] ? handle_mm_fault+0x5a/0x16e0 [ 563.000298] __do_page_fault+0x28f/0x590 [ 563.002021] trace_do_page_fault+0x58/0x2c0 [ 563.003884] do_async_page_fault+0x2c/0x90 [ 563.005618] async_page_fault+0x28/0x30 [ 563.007591] RIP: 0033:0x5599ee40db47 [ 563.008351] RSP: 002b:00007ffcd950e2b0 EFLAGS: 00010206 [ 563.009880] RAX: 00001b11af82149c RBX: 00005599eec9ab70 RCX: 603ec9bec9b93394 [ 563.012267] RDX: 00007f757dbee630 RSI: 00005599eec9ee50 RDI: 00005599eec9ab70 [ 563.015045] RBP: 0000000000ab8cb8 R08: 0000000000ab8cc8 R09: 00005599eec9da20 [ 563.017843] R10: 00005599ee43fe1c R11: 0000000000000001 R12: 00005599eec9ab70 [ 563.020723] R13: 0000000000000022 R14: 00005599eeca4da0 R15: 00007ffcd950e390 [ 563.023435] xfs_db D 0 2308 2306 0x00000000 [ 563.025872] Call Trace: [ 563.027186] __schedule+0x2d5/0xab0 [ 563.028596] schedule+0x40/0x90 [ 563.029720] ? delayacct_end+0x3e/0x60 [ 563.031244] ? __delayacct_blkio_start+0x1f/0x30 [ 563.032929] ? kvm_clock_read+0x23/0x40 [ 563.034275] ? mem_cgroup_uncharge_swap+0x10/0x150 [ 563.036235] ? read_swap_cache_async+0x26/0x60 [ 563.038134] ? swapin_readahead+0x1af/0x220 [ 563.039657] ? find_get_entry+0x1b6/0x2a0 [ 563.041205] ? unlock_page+0x50/0x50 [ 563.043030] ? do_swap_page+0x2dd/0x8c0 [ 563.044118] ? do_swap_page+0x2fe/0x8c0 [ 563.044830] ? handle_mm_fault+0xa31/0x16e0 [ 563.045961] ? handle_mm_fault+0x5a/0x16e0 [ 563.046994] ? __do_page_fault+0x28f/0x590 [ 563.048175] ? trace_do_page_fault+0x58/0x2c0 [ 563.049194] ? do_async_page_fault+0x2c/0x90 [ 563.050378] ? async_page_fault+0x28/0x30 Fourth trace: [ 609.745471] sysrq: SysRq : Show Blocked State [ 609.747748] task PC stack pid father [ 609.748789] systemd-journal D 0 653 1 0x00000080 [ 609.751804] Call Trace: [ 609.753311] __schedule+0x2d5/0xab0 [ 609.755151] schedule+0x40/0x90 [ 609.756573] schedule_timeout+0x334/0x690 [ 609.757674] ? __schedule+0x2d5/0xab0 [ 609.758452] ? generic_make_request+0xf1/0x2e0 [ 609.759149] ? cpuacct_charge+0x37/0x1f0 [ 609.759753] ? update_load_avg+0x7a/0x520 [ 609.760501] ? update_load_avg+0x7a/0x520 [ 609.761250] ? set_next_entity+0x104/0x300 [ 609.761940] ? pick_next_task_fair+0x751/0x790 [ 609.763999] ? __schedule+0x2d5/0xab0 [ 609.765146] ? trace_hardirqs_on+0xd/0x10 [ 609.766553] ? schedule+0x40/0x90 [ 609.783526] ? _xfs_log_force_lsn+0x323/0x3a0 [ 609.792476] ? wake_up_q+0x80/0x80 [ 609.795954] ? xfs_file_fsync+0x114/0x2c0 [ 609.798646] ? vfs_fsync_range+0x4b/0xb0 [ 609.799554] ? do_fsync+0x3d/0x70 [ 609.800307] ? SyS_fsync+0x10/0x20 [ 609.802707] ? do_syscall_64+0x6c/0x1f0 [ 609.804731] ? entry_SYSCALL64_slow_path+0x25/0x25 [ 609.816389] xfs_db D 0 2308 2306 0x00000000 [ 609.817768] Call Trace: [ 609.818379] __schedule+0x2d5/0xab0 [ 609.819687] schedule+0x40/0x90 [ 609.820478] schedule_timeout+0x334/0x690 [ 609.821556] ? __delayacct_blkio_start+0x1f/0x30 [ 609.822648] ? kvm_clock_read+0x23/0x40 [ 609.823498] ? kvm_clock_get_cycles+0x9/0x10 [ 609.824543] ? ktime_get+0xb3/0x150 [ 609.825408] ? __delayacct_blkio_start+0x1f/0x30 [ 609.826639] io_schedule_timeout+0xa4/0x110 [ 609.827703] ? trace_hardirqs_on+0xd/0x10 [ 609.829018] __lock_page_or_retry+0x1e5/0x350 [ 609.831220] ? unlock_page+0x50/0x50 [ 609.833020] do_swap_page+0x4bb/0x8c0 [ 609.834872] ? do_swap_page+0x2fe/0x8c0 [ 609.836832] handle_mm_fault+0xa31/0x16e0 [ 609.838632] ? handle_mm_fault+0x5a/0x16e0 [ 609.839577] __do_page_fault+0x28f/0x590 [ 609.840498] trace_do_page_fault+0x58/0x2c0 [ 609.841492] do_async_page_fault+0x2c/0x90 [ 609.842435] async_page_fault+0x28/0x30 [ 609.843299] RIP: 0033:0x466196 [ 609.844835] RSP: 002b:00007ffef23334e0 EFLAGS: 00010287 [ 609.846395] RAX: 0000000000000001 RBX: 00000000010452e0 RCX: 0000000000000000 [ 609.847741] RDX: 00000000010441c0 RSI: 00007ffef2333570 RDI: 000000003a068050 [ 609.850063] RBP: 0000000001047480 R08: 0000000001047498 R09: 0000000000000000 [ 609.851833] R10: 0000000000000000 R11: 000000000000471c R12: 000000003a17f050 [ 609.854577] R13: 00007ffef2333570 R14: 0000000001047498 R15: 000000003a068050 [ 617.144162] systemd-journald[653]: Successfully sent stream file descriptor to service manager. [ 671.806678] systemd-journald[653]: Sent WATCHDOG=1 notification.