On Mon, Aug 6, 2018 at 8:13 PM Ilya Dryomov <idryomov@xxxxxxxxx> wrote: > > On Thu, Jul 26, 2018 at 1:55 AM Alex Gorbachev <ag@xxxxxxxxxxxxxxxxxxx> wrote: > > > > On Wed, Jul 25, 2018 at 7:07 PM, Alex Gorbachev <ag@xxxxxxxxxxxxxxxxxxx> wrote: > > > On Wed, Jul 25, 2018 at 6:07 PM, Alex Gorbachev <ag@xxxxxxxxxxxxxxxxxxx> wrote: > > >> On Wed, Jul 25, 2018 at 5:51 PM, Jason Dillaman <jdillama@xxxxxxxxxx> wrote: > > >>> > > >>> > > >>> On Wed, Jul 25, 2018 at 5:41 PM Alex Gorbachev <ag@xxxxxxxxxxxxxxxxxxx> > > >>> wrote: > > >>>> > > >>>> I am not sure this related to RBD, but in case it is, this would be an > > >>>> important bug to fix. > > >>>> > > >>>> Running LVM on top of RBD, XFS filesystem on top of that, consumed in RHEL > > >>>> 7.4. > > >>>> > > >>>> When running a large read operation and doing LVM snapshots during > > >>>> that operation, the block being read winds up all zeroes in pagecache. > > >>>> > > >>>> Dropping the caches syncs up the block with what's on "disk" and > > >>>> everything is fine. > > >>>> > > >>>> Working on steps to reproduce simply - ceph is Luminous 12.2.7, RHEL > > >>>> client is Jewel 10.2.10-17.el7cp > > >>> > > >>> > > >>> Is this krbd or QEMU+librbd? If the former, what kernel version are you > > >>> running? > > >> > > >> It's krbd on RHEL. > > >> > > >> RHEL kernel: > > >> > > >> Linux dmg-cbcache01 3.10.0-862.el7.x86_64 #1 SMP Wed Mar 21 18:14:51 > > >> EDT 2018 x86_64 x86_64 x86_64 GNU/Linux > > > > > > Not sure if this is exactly replicating the issue, but I was able to > > > do this on two different systems: > > > > > > RHEL 7.4 kernel as above. > > > > > > Create a PVM PV on a mapped kRBD device > > > > > > example: pvcreate /dev/rbd/spin1/lvm1 > > > > > > Create a VG and LV, make an XFS FS > > > > > > vgcreate datavg /dev/rbd/spin1/lvm1 > > > lvcreate -n data1 -L 5G datavg > > > mkfs.xfs /dev/datavg/data1 > > > <mount on something> > > > > > > Get some large file and copy it to some other file, same storage or > > > different. All is well. > > > > > > Now snapshot the LV > > > > > > lvcreate -l8%ORIGIN -s -n snap_data1 /dev/datavg/data1 --addtag backup > > > > > > Now try to copy that file again. I get: > > > > > > NMI watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [kworker/2:1:3470] > > > > > > And in dmesg (this is on Proxmox but I did the same on ESXi) > > > > > > [1397609.308673] sched: RT throttling activated > > > [1397658.759259] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! > > > [kworker/0:1:2648] > > > [1397658.759354] Modules linked in: dm_snapshot dm_bufio rbd libceph > > > rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache > > > sunrpc ppdev joydev pcspkr sg parport_pc virtio_balloon parport shpchp > > > i2c_piix4 ip_tables xfs libcrc32c sd_mod sr_mod crc_t10dif > > > crct10dif_generic cdrom crct10dif_common ata_generic pata_acpi > > > virtio_scsi virtio_console virtio_net bochs_drm drm_kms_helper > > > syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm ata_piix libata > > > serio_raw virtio_pci i2c_core virtio_ring virtio floppy dm_mirror > > > dm_region_hash dm_log dm_mod > > > [1397658.759400] CPU: 0 PID: 2648 Comm: kworker/0:1 Kdump: loaded Not > > > tainted 3.10.0-862.el7.x86_64 #1 > > > [1397658.759402] Hardware name: QEMU Standard PC (i440FX + PIIX, > > > 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org > > > 04/01/2014 > > > [1397658.759415] Workqueue: kcopyd do_work [dm_mod] > > > [1397658.759418] task: ffff932df65d3f40 ti: ffff932fb138c000 task.ti: > > > ffff932fb138c000 > > > [1397658.759420] RIP: 0010:[<ffffffffc08535a0>] [<ffffffffc08535a0>] > > > copy_callback+0x50/0x130 [dm_snapshot] > > > [1397658.759426] RSP: 0018:ffff932fb138fd08 EFLAGS: 00000283 > > > [1397658.759428] RAX: 000000000003e5e8 RBX: ffffebecc4943ec0 RCX: > > > ffff932ff4704068 > > > [1397658.759430] RDX: ffff932dc8050d00 RSI: ffff932fd6a0f9b8 RDI: > > > 0000000000000000 > > > [1397658.759431] RBP: ffff932fb138fd28 R08: ffff932dc7d2c0b0 R09: > > > ffff932dc8050d20 > > > [1397658.759433] R10: 00000000c7d2b301 R11: ffffebecc01f4a00 R12: > > > 0000000000000000 > > > [1397658.759435] R13: 0000000180090003 R14: 0000000000000000 R15: > > > 00000000ffffff80 > > > [1397658.759438] FS: 0000000000000000(0000) GS:ffff932fffc00000(0000) > > > knlGS:0000000000000000 > > > [1397658.759440] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b > > > [1397658.759442] CR2: 00007f17bcd5e860 CR3: 0000000042c0e000 CR4: > > > 00000000000006f0 > > > [1397658.759447] Call Trace: > > > [1397658.759452] [<ffffffffc0853550>] ? origin_resume+0x70/0x70 [dm_snapshot] > > > [1397658.759459] [<ffffffffc0340eab>] run_complete_job+0x6b/0xc0 [dm_mod] > > > [1397658.759466] [<ffffffffc0340f60>] process_jobs+0x60/0x100 [dm_mod] > > > [1397658.759471] [<ffffffffc0340e40>] ? kcopyd_put_pages+0x50/0x50 [dm_mod] > > > [1397658.759477] [<ffffffffc0341042>] do_work+0x42/0x90 [dm_mod] > > > [1397658.759483] [<ffffffffa08b2dff>] process_one_work+0x17f/0x440 > > > [1397658.759485] [<ffffffffa08b3bcc>] worker_thread+0x22c/0x3c0 > > > [1397658.759489] [<ffffffffa08b39a0>] ? manage_workers.isra.24+0x2a0/0x2a0 > > > [1397658.759494] [<ffffffffa08bae31>] kthread+0xd1/0xe0 > > > [1397658.759497] [<ffffffffa08bad60>] ? insert_kthread_work+0x40/0x40 > > > [1397658.759503] [<ffffffffa0f1f637>] ret_from_fork_nospec_begin+0x21/0x21 > > > [1397658.759506] [<ffffffffa08bad60>] ? insert_kthread_work+0x40/0x40 > > > > > > > > > > > > Tried same on Ubuntu kernel 4.14.39 - no issues > > I reproduced multiple soft lockups in copy_callback() on ceph-client > testing branch (based on 4.18-rc7) and on 4.14.39 -- wanted to confirm > it wasn't a regression. There is plenty of reports of kcopyd lockups > in various places, here is one in copy_callback() on a disk array: > > https://www.mail-archive.com/dm-devel@xxxxxxxxxx/msg09077.html > > Why are you using dm-snapshot snapshots? The performance is not great, > and they are pretty inflexible. Both lvm and device mapper folks steer > people towards dm-thin instead, I believe. And of course, rbd supports > native snapshots! Attaching dmesg for reference. Thanks, Ilya
[ 95.015411] XFS (dm-0): Mounting V5 Filesystem [ 95.133392] XFS (dm-0): Ending clean mount [ 316.204666] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [kworker/0:0:3] [ 316.212149] Modules linked in: dm_snapshot dm_bufio xfs rbd libceph libcrc32c gpio_ich intel_powerclamp coretemp kvm_intel kvm irqbypass ipmi_si joydev serio_raw lpc_ich ipmi_devintf i7core_edac shpchp ipmi_msghandler ib_iser rdma_cm iw_cm ib_cm ib_core configfs nfsd auth_rpcgss iscsi_tcp nfs_acl lockd libiscsi_tcp grace libiscsi scsi_transport_iscsi sunrpc lp parport autofs4 btrfs xor zstd_decompress zstd_compress xxhash raid6_pq hid_generic usbhid hid e1000e ahci ptp psmouse libahci pps_core arcmsr [ 316.212221] irq event stamp: 7006442 [ 316.212226] hardirqs last enabled at (7006441): [<ffffffffb924c1bc>] kmem_cache_free+0xac/0x2f0 [ 316.212230] hardirqs last disabled at (7006442): [<ffffffffb9a00942>] common_interrupt+0x82/0x87 [ 316.212233] softirqs last enabled at (6475394): [<ffffffffb9c00393>] __do_softirq+0x393/0x4a6 [ 316.212237] softirqs last disabled at (6475361): [<ffffffffb908c3a1>] irq_exit+0xc1/0xd0 [ 316.212240] CPU: 0 PID: 3 Comm: kworker/0:0 Not tainted 4.14.39-ceph-g7d6240f0fb85 #1 [ 316.212241] Hardware name: Supermicro X8SIL/X8SIL, BIOS 1.0c 02/25/2010 [ 316.212245] Workqueue: kcopyd do_work [ 316.212248] task: ffff8848ad63ca80 task.stack: ffff9891418bc000 [ 316.212253] RIP: 0010:copy_callback+0x47/0x190 [dm_snapshot] [ 316.212255] RSP: 0018:ffff9891418bfd40 EFLAGS: 00000293 ORIG_RAX: ffffffffffffffbe [ 316.212258] RAX: ffff8845e5d65d00 RBX: ffff8848a72e64c8 RCX: 000000000000000f [ 316.212259] RDX: 00000000000164d9 RSI: 0000000000000000 RDI: 0000000000000000 [ 316.212261] RBP: ffff8845e30137f8 R08: 00000000e3130001 R09: 0000000180070005 [ 316.212262] R10: ffff9891418bfc70 R11: 0000000000000000 R12: ffffffffc0685550 [ 316.212264] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8845e30137f8 [ 316.212266] FS: 0000000000000000(0000) GS:ffff8848bfc00000(0000) knlGS:0000000000000000 [ 316.212268] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 316.212269] CR2: 00005632eacc1f50 CR3: 0000000238210000 CR4: 00000000000006f0 [ 316.212271] Call Trace: [ 316.212278] ? origin_resume+0x80/0x80 [dm_snapshot] [ 316.212281] run_complete_job+0x5f/0xb0 [ 316.212285] process_jobs+0x97/0x210 [ 316.212288] ? drop_pages+0x40/0x40 [ 316.212292] do_work+0x45/0x90 [ 316.212300] ? process_one_work+0x1cf/0x610 [ 316.212301] ? process_jobs+0x210/0x210 [ 316.212303] process_one_work+0x1cf/0x610 [ 316.212310] worker_thread+0x204/0x3e0 [ 316.212316] kthread+0x10a/0x140 [ 316.212318] ? process_one_work+0x610/0x610 [ 316.212320] ? kthread_stop+0x290/0x290 [ 316.212324] ret_from_fork+0x3a/0x50 [ 316.212333] Code: c0 09 d0 48 8b 55 50 0f b6 c0 89 45 4c 48 3b 93 c0 00 00 00 74 56 48 8b 83 d0 00 00 00 48 81 c3 c8 00 00 00 48 39 d8 75 0b eb 12 <48> 8b 40 08 48 39 d8 74 09 48 3b 50 f8 76 f1 48 89 c3 4c 8b 2b [ 404.203743] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [kworker/0:0:3] [ 404.211066] Modules linked in: dm_snapshot dm_bufio xfs rbd libceph libcrc32c gpio_ich intel_powerclamp coretemp kvm_intel kvm irqbypass ipmi_si joydev serio_raw lpc_ich ipmi_devintf i7core_edac shpchp ipmi_msghandler ib_iser rdma_cm iw_cm ib_cm ib_core configfs nfsd auth_rpcgss iscsi_tcp nfs_acl lockd libiscsi_tcp grace libiscsi scsi_transport_iscsi sunrpc lp parport autofs4 btrfs xor zstd_decompress zstd_compress xxhash raid6_pq hid_generic usbhid hid e1000e ahci ptp psmouse libahci pps_core arcmsr [ 404.211138] irq event stamp: 18021984 [ 404.211144] hardirqs last enabled at (18021983): [<ffffffffb924c1bc>] kmem_cache_free+0xac/0x2f0 [ 404.211148] hardirqs last disabled at (18021984): [<ffffffffb9a00942>] common_interrupt+0x82/0x87 [ 404.211151] softirqs last enabled at (18021942): [<ffffffffb9c00393>] __do_softirq+0x393/0x4a6 [ 404.211155] softirqs last disabled at (18021935): [<ffffffffb908c3a1>] irq_exit+0xc1/0xd0 [ 404.211158] CPU: 0 PID: 3 Comm: kworker/0:0 Tainted: G L 4.14.39-ceph-g7d6240f0fb85 #1 [ 404.211159] Hardware name: Supermicro X8SIL/X8SIL, BIOS 1.0c 02/25/2010 [ 404.211163] Workqueue: kcopyd do_work [ 404.211166] task: ffff8848ad63ca80 task.stack: ffff9891418bc000 [ 404.211171] RIP: 0010:copy_callback+0x47/0x190 [dm_snapshot] [ 404.211173] RSP: 0018:ffff9891418bfd40 EFLAGS: 00000287 ORIG_RAX: ffffffffffffffbe [ 404.211176] RAX: ffff8844ac333fd0 RBX: ffff8848a72e64c8 RCX: 000000000000000f [ 404.211178] RDX: 000000000004739d RSI: 0000000000000000 RDI: 0000000000000000 [ 404.211179] RBP: ffff88450b2acd98 R08: 00000000d0e89001 R09: 0000000180070004 [ 404.211181] R10: ffff9891418bfc70 R11: 0000000000000000 R12: ffffffffc0685550 [ 404.211182] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88450b2acd98 [ 404.211184] FS: 0000000000000000(0000) GS:ffff8848bfc00000(0000) knlGS:0000000000000000 [ 404.211186] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 404.211188] CR2: 0000560624729220 CR3: 0000000238210000 CR4: 00000000000006f0 [ 404.211189] Call Trace: [ 404.211196] ? origin_resume+0x80/0x80 [dm_snapshot] [ 404.211199] run_complete_job+0x5f/0xb0 [ 404.211203] process_jobs+0x97/0x210 [ 404.211206] ? drop_pages+0x40/0x40 [ 404.211210] do_work+0x45/0x90 [ 404.211218] ? process_one_work+0x1cf/0x610 [ 404.211220] ? process_jobs+0x210/0x210 [ 404.211222] process_one_work+0x1cf/0x610 [ 404.211228] worker_thread+0x204/0x3e0 [ 404.211234] kthread+0x10a/0x140 [ 404.211237] ? process_one_work+0x610/0x610 [ 404.211238] ? kthread_stop+0x290/0x290 [ 404.211242] ret_from_fork+0x3a/0x50 [ 404.211251] Code: c0 09 d0 48 8b 55 50 0f b6 c0 89 45 4c 48 3b 93 c0 00 00 00 74 56 48 8b 83 d0 00 00 00 48 81 c3 c8 00 00 00 48 39 d8 75 0b eb 12 <48> 8b 40 08 48 39 d8 74 09 48 3b 50 f8 76 f1 48 89 c3 4c 8b 2b [ 431.696810] rbd: rbd0: encountered watch error: -107 [ 432.203386] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [kworker/0:0:3] [ 432.210650] Modules linked in: dm_snapshot dm_bufio xfs rbd libceph libcrc32c gpio_ich intel_powerclamp coretemp kvm_intel kvm irqbypass ipmi_si joydev serio_raw lpc_ich ipmi_devintf i7core_edac shpchp ipmi_msghandler ib_iser rdma_cm iw_cm ib_cm ib_core configfs nfsd auth_rpcgss iscsi_tcp nfs_acl lockd libiscsi_tcp grace libiscsi scsi_transport_iscsi sunrpc lp parport autofs4 btrfs xor zstd_decompress zstd_compress xxhash raid6_pq hid_generic usbhid hid e1000e ahci ptp psmouse libahci pps_core arcmsr [ 432.210722] irq event stamp: 18415914 [ 432.210727] hardirqs last enabled at (18415913): [<ffffffffb924c1bc>] kmem_cache_free+0xac/0x2f0 [ 432.210731] hardirqs last disabled at (18415914): [<ffffffffb9a00942>] common_interrupt+0x82/0x87 [ 432.210734] softirqs last enabled at (18352076): [<ffffffffb9c00393>] __do_softirq+0x393/0x4a6 [ 432.210738] softirqs last disabled at (18352043): [<ffffffffb908c3a1>] irq_exit+0xc1/0xd0 [ 432.210741] CPU: 0 PID: 3 Comm: kworker/0:0 Tainted: G L 4.14.39-ceph-g7d6240f0fb85 #1 [ 432.210742] Hardware name: Supermicro X8SIL/X8SIL, BIOS 1.0c 02/25/2010 [ 432.210746] Workqueue: kcopyd do_work [ 432.210749] task: ffff8848ad63ca80 task.stack: ffff9891418bc000 [ 432.210754] RIP: 0010:copy_callback+0x47/0x190 [dm_snapshot] [ 432.210756] RSP: 0018:ffff9891418bfd40 EFLAGS: 00000293 ORIG_RAX: ffffffffffffffbe [ 432.210759] RAX: ffff88450028bc88 RBX: ffff8848a72e64c8 RCX: 000000000000000f [ 432.210760] RDX: 0000000000041744 RSI: 0000000000000000 RDI: 0000000000000000 [ 432.210762] RBP: ffff8844d8f273c0 R08: 00000000d8ffc101 R09: 0000000180070001 [ 432.210764] R10: ffff9891418bfc70 R11: 0000000000000000 R12: ffffffffc0685550 [ 432.210765] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8844d8f273c0 [ 432.210767] FS: 0000000000000000(0000) GS:ffff8848bfc00000(0000) knlGS:0000000000000000 [ 432.210769] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 432.210770] CR2: 0000560624729220 CR3: 0000000238210000 CR4: 00000000000006f0 [ 432.210772] Call Trace: [ 432.210779] ? origin_resume+0x80/0x80 [dm_snapshot] [ 432.210781] run_complete_job+0x5f/0xb0 [ 432.210786] process_jobs+0x97/0x210 [ 432.210788] ? drop_pages+0x40/0x40 [ 432.210793] do_work+0x45/0x90 [ 432.210800] ? process_one_work+0x1cf/0x610 [ 432.210802] ? process_jobs+0x210/0x210 [ 432.210804] process_one_work+0x1cf/0x610 [ 432.210811] worker_thread+0x204/0x3e0 [ 432.210817] kthread+0x10a/0x140 [ 432.210819] ? process_one_work+0x610/0x610 [ 432.210821] ? kthread_stop+0x290/0x290 [ 432.210825] ret_from_fork+0x3a/0x50 [ 432.210834] Code: c0 09 d0 48 8b 55 50 0f b6 c0 89 45 4c 48 3b 93 c0 00 00 00 74 56 48 8b 83 d0 00 00 00 48 81 c3 c8 00 00 00 48 39 d8 75 0b eb 12 <48> 8b 40 08 48 39 d8 74 09 48 3b 50 f8 76 f1 48 89 c3 4c 8b 2b [ 441.667232] INFO: rcu_sched self-detected stall on CPU [ 441.671381] INFO: rcu_sched detected stalls on CPUs/tasks: [ 441.671421] 0-...: (16248 ticks this GP) idle=692/140000000000001/0 softirq=9034/9034 fqs=7619 [ 441.671442] (detected by 1, t=16252 jiffies, g=10671, c=10670, q=3144) [ 441.671446] Sending NMI from CPU 1 to CPUs 0: [ 441.672468] NMI backtrace for cpu 0 [ 441.672469] CPU: 0 PID: 3 Comm: kworker/0:0 Tainted: G L 4.14.39-ceph-g7d6240f0fb85 #1 [ 441.672470] Hardware name: Supermicro X8SIL/X8SIL, BIOS 1.0c 02/25/2010 [ 441.672470] Workqueue: kcopyd do_work [ 441.672471] task: ffff8848ad63ca80 task.stack: ffff9891418bc000 [ 441.672471] RIP: 0010:delay_tsc+0x19/0x70 [ 441.672472] RSP: 0018:ffff8848bfc03bf0 EFLAGS: 00000003 [ 441.672473] RAX: 00000000cf0dc50a RBX: ffffffffbb54e820 RCX: 000000000124fb70 [ 441.672473] RDX: 0000000000000122 RSI: 0000000000000005 RDI: 000000000000094d [ 441.672474] RBP: 000000000000270d R08: 0000000000000000 R09: 0000000000000000 [ 441.672474] R10: ffff8848bfc03bd0 R11: ffffffffb9588b47 R12: 0000000000000020 [ 441.672475] R13: ffffffffbb4863da R14: 000000000000003a R15: 0000000000000046 [ 441.672475] FS: 0000000000000000(0000) GS:ffff8848bfc00000(0000) knlGS:0000000000000000 [ 441.672476] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 441.672476] CR2: 0000560624729220 CR3: 0000000238210000 CR4: 00000000000006f0 [ 441.672477] Call Trace: [ 441.672477] <IRQ> [ 441.672478] wait_for_xmitr+0x25/0xb0 [ 441.672478] serial8250_console_putchar+0x16/0x30 [ 441.672479] ? serial8250_put_poll_char+0xd0/0xd0 [ 441.672479] uart_console_write+0x2e/0x70 [ 441.672480] serial8250_console_write+0xe3/0x2c0 [ 441.672480] ? console_unlock+0x1a2/0x580 [ 441.672480] console_unlock+0x451/0x580 [ 441.672481] vprintk_emit+0x23c/0x3c0 [ 441.672481] printk+0x43/0x4b [ 441.672482] rcu_check_callbacks+0x502/0x950 [ 441.672482] ? tick_sched_do_timer+0x30/0x30 [ 441.672483] update_process_times+0x28/0x50 [ 441.672483] tick_sched_handle+0x23/0x60 [ 441.672483] tick_sched_timer+0x34/0x70 [ 441.672484] __hrtimer_run_queues+0xde/0x4c0 [ 441.672484] hrtimer_interrupt+0xad/0x210 [ 441.672485] hpet_interrupt_handler+0x15/0x30 [ 441.672485] __handle_irq_event_percpu+0x42/0x350 [ 441.672486] handle_irq_event_percpu+0x20/0x50 [ 441.672486] handle_irq_event+0x37/0x60 [ 441.672487] handle_edge_irq+0x90/0x230 [ 441.672487] handle_irq+0x74/0x110 [ 441.672487] do_IRQ+0x57/0x110 [ 441.672488] common_interrupt+0x87/0x87 [ 441.672488] </IRQ> [ 441.672489] RIP: 0010:copy_callback+0x47/0x190 [dm_snapshot] [ 441.672489] RSP: 0018:ffff9891418bfd40 EFLAGS: 00000287 ORIG_RAX: ffffffffffffffbe [ 441.672490] RAX: ffff884510d4e238 RBX: ffff8848a72e64c8 RCX: 000000000000000f [ 441.672491] RDX: 000000000004aef8 RSI: 0000000000000000 RDI: 0000000000000000 [ 441.672491] RBP: ffff88450fe34c30 R08: 00000000c9659001 R09: 0000000180070004 [ 441.672492] R10: ffff9891418bfc70 R11: 0000000000000000 R12: ffffffffc0685550 [ 441.672492] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88450fe34c30 [ 441.672493] ? origin_resume+0x80/0x80 [dm_snapshot] [ 441.672493] ? origin_resume+0x80/0x80 [dm_snapshot] [ 441.672494] run_complete_job+0x5f/0xb0 [ 441.672494] process_jobs+0x97/0x210 [ 441.672494] ? drop_pages+0x40/0x40 [ 441.672495] do_work+0x45/0x90 [ 441.672495] ? process_one_work+0x1cf/0x610 [ 441.672496] ? process_jobs+0x210/0x210 [ 441.672496] process_one_work+0x1cf/0x610 [ 441.672497] worker_thread+0x204/0x3e0 [ 441.672497] kthread+0x10a/0x140 [ 441.672498] ? process_one_work+0x610/0x610 [ 441.672498] ? kthread_stop+0x290/0x290 [ 441.672498] ret_from_fork+0x3a/0x50 [ 441.672499] Code: 8b 05 fc 8c a8 00 48 8d 7a 01 e9 c3 06 34 00 0f 1f 00 66 66 66 66 90 65 ff 05 c4 33 75 46 65 44 8b 05 34 c8 74 46 0f ae e8 0f 31 <48> c1 e2 20 48 89 d6 48 09 c6 0f ae e8 0f 31 48 c1 e2 20 48 09 [ 441.694280] 0-...: (16248 ticks this GP) idle=692/140000000000001/0 softirq=9034/9034 fqs=7622 [ 441.703422] (t=16260 jiffies g=10671 c=10670 q=3144) [ 441.708749] NMI backtrace for cpu 0 [ 441.708766] CPU: 0 PID: 3 Comm: kworker/0:0 Tainted: G L 4.14.39-ceph-g7d6240f0fb85 #1 [ 441.708768] Hardware name: Supermicro X8SIL/X8SIL, BIOS 1.0c 02/25/2010 [ 441.708770] Workqueue: kcopyd do_work [ 441.708772] Call Trace: [ 441.708774] <IRQ> [ 441.708778] dump_stack+0x7c/0xbe [ 441.708781] nmi_cpu_backtrace+0xc0/0xd0 [ 441.708785] ? irq_force_complete_move+0x120/0x120 [ 441.708788] nmi_trigger_cpumask_backtrace+0xea/0x120 [ 441.708791] rcu_dump_cpu_stacks+0x99/0xcb [ 441.708795] rcu_check_callbacks+0x79e/0x950 [ 441.708800] ? tick_sched_do_timer+0x30/0x30 [ 441.708803] update_process_times+0x28/0x50 [ 441.708806] tick_sched_handle+0x23/0x60 [ 441.708809] tick_sched_timer+0x34/0x70 [ 441.708812] __hrtimer_run_queues+0xde/0x4c0 [ 441.708817] hrtimer_interrupt+0xad/0x210 [ 441.708823] hpet_interrupt_handler+0x15/0x30 [ 441.708825] __handle_irq_event_percpu+0x42/0x350 [ 441.708829] handle_irq_event_percpu+0x20/0x50 [ 441.708832] handle_irq_event+0x37/0x60 [ 441.708836] handle_edge_irq+0x90/0x230 [ 441.708838] handle_irq+0x74/0x110 [ 441.708841] do_IRQ+0x57/0x110 [ 441.708845] common_interrupt+0x87/0x87 [ 441.708847] </IRQ> [ 441.708850] RIP: 0010:copy_callback+0x47/0x190 [dm_snapshot] [ 441.708852] RSP: 0018:ffff9891418bfd40 EFLAGS: 00000287 ORIG_RAX: ffffffffffffffbe [ 441.708855] RAX: ffff884510d4e238 RBX: ffff8848a72e64c8 RCX: 000000000000000f [ 441.708856] RDX: 000000000004aef8 RSI: 0000000000000000 RDI: 0000000000000000 [ 441.708858] RBP: ffff88450fe34c30 R08: 00000000c9659001 R09: 0000000180070004 [ 441.708859] R10: ffff9891418bfc70 R11: 0000000000000000 R12: ffffffffc0685550 [ 441.708861] R13: 0000000000000000 R14: 0000000000000000 R15: ffff88450fe34c30 [ 441.708865] ? origin_resume+0x80/0x80 [dm_snapshot] [ 441.708874] ? origin_resume+0x80/0x80 [dm_snapshot] [ 441.708877] run_complete_job+0x5f/0xb0 [ 441.708881] process_jobs+0x97/0x210 [ 441.708883] ? drop_pages+0x40/0x40 [ 441.708888] do_work+0x45/0x90 [ 441.708894] ? process_one_work+0x1cf/0x610 [ 441.708896] ? process_jobs+0x210/0x210 [ 441.708898] process_one_work+0x1cf/0x610 [ 441.708904] worker_thread+0x204/0x3e0 [ 441.708909] kthread+0x10a/0x140 [ 441.708912] ? process_one_work+0x610/0x610 [ 441.708913] ? kthread_stop+0x290/0x290 [ 441.708917] ret_from_fork+0x3a/0x50 [ 458.794778] perf: interrupt took too long (2503 > 2500), lowering kernel.perf_event_max_sample_rate to 79750 [ 503.821159] rbd: rbd0: encountered watch error: -107 [ 504.202478] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [kworker/0:0:3] [ 504.209975] Modules linked in: dm_snapshot dm_bufio xfs rbd libceph libcrc32c gpio_ich intel_powerclamp coretemp kvm_intel kvm irqbypass ipmi_si joydev serio_raw lpc_ich ipmi_devintf i7core_edac shpchp ipmi_msghandler ib_iser rdma_cm iw_cm ib_cm ib_core configfs nfsd auth_rpcgss iscsi_tcp nfs_acl lockd libiscsi_tcp grace libiscsi scsi_transport_iscsi sunrpc lp parport autofs4 btrfs xor zstd_decompress zstd_compress xxhash raid6_pq hid_generic usbhid hid e1000e ahci ptp psmouse libahci pps_core arcmsr [ 504.210118] irq event stamp: 21161206 [ 504.210127] hardirqs last enabled at (21161205): [<ffffffffb924c1bc>] kmem_cache_free+0xac/0x2f0 [ 504.210134] hardirqs last disabled at (21161206): [<ffffffffb9a00942>] common_interrupt+0x82/0x87 [ 504.210140] softirqs last enabled at (21161114): [<ffffffffb9c00393>] __do_softirq+0x393/0x4a6 [ 504.210146] softirqs last disabled at (21161107): [<ffffffffb908c3a1>] irq_exit+0xc1/0xd0 [ 504.210152] CPU: 0 PID: 3 Comm: kworker/0:0 Tainted: G L 4.14.39-ceph-g7d6240f0fb85 #1 [ 504.210155] Hardware name: Supermicro X8SIL/X8SIL, BIOS 1.0c 02/25/2010 [ 504.210161] Workqueue: kcopyd do_work [ 504.210167] task: ffff8848ad63ca80 task.stack: ffff9891418bc000 [ 504.210176] RIP: 0010:copy_callback+0x4b/0x190 [dm_snapshot] [ 504.210179] RSP: 0018:ffff9891418bfd40 EFLAGS: 00000293 ORIG_RAX: ffffffffffffffbe [ 504.210185] RAX: ffff8845abc45238 RBX: ffff8848a72e64c8 RCX: 000000000000000f [ 504.210188] RDX: 0000000000076e30 RSI: 0000000000000000 RDI: 0000000000000000 [ 504.210192] RBP: ffff8845b6bbe2d0 R08: 00000000b17b4101 R09: 0000000180070001 [ 504.210195] R10: ffff9891418bfc70 R11: 0000000000000000 R12: ffffffffc0685550 [ 504.210199] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8845b6bbe2d0 [ 504.210203] FS: 0000000000000000(0000) GS:ffff8848bfc00000(0000) knlGS:0000000000000000 [ 504.210206] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 504.210210] CR2: 0000560624729220 CR3: 0000000238210000 CR4: 00000000000006f0 [ 504.210213] Call Trace: [ 504.210225] ? origin_resume+0x80/0x80 [dm_snapshot] [ 504.210230] run_complete_job+0x5f/0xb0 [ 504.210240] process_jobs+0x97/0x210 [ 504.210245] ? drop_pages+0x40/0x40 [ 504.210254] do_work+0x45/0x90 [ 504.210268] ? process_one_work+0x1cf/0x610 [ 504.210272] ? process_jobs+0x210/0x210 [ 504.210277] process_one_work+0x1cf/0x610 [ 504.210290] worker_thread+0x204/0x3e0 [ 504.210302] kthread+0x10a/0x140 [ 504.210307] ? process_one_work+0x610/0x610 [ 504.210310] ? kthread_stop+0x290/0x290 [ 504.210318] ret_from_fork+0x3a/0x50 [ 504.210337] Code: 8b 55 50 0f b6 c0 89 45 4c 48 3b 93 c0 00 00 00 74 56 48 8b 83 d0 00 00 00 48 81 c3 c8 00 00 00 48 39 d8 75 0b eb 12 48 8b 40 08 <48> 39 d8 74 09 48 3b 50 f8 76 f1 48 89 c3 4c 8b 2b 4c 8d 65 58 [ 572.217666] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [kworker/2:2:120] [ 572.225087] Modules linked in: dm_snapshot dm_bufio xfs rbd libceph libcrc32c gpio_ich intel_powerclamp coretemp kvm_intel kvm irqbypass ipmi_si joydev serio_raw lpc_ich ipmi_devintf i7core_edac shpchp ipmi_msghandler ib_iser rdma_cm iw_cm ib_cm ib_core configfs nfsd auth_rpcgss iscsi_tcp nfs_acl lockd libiscsi_tcp grace libiscsi scsi_transport_iscsi sunrpc lp parport autofs4 btrfs xor zstd_decompress zstd_compress xxhash raid6_pq hid_generic usbhid hid e1000e ahci ptp psmouse libahci pps_core arcmsr [ 572.225159] irq event stamp: 6250656 [ 572.225165] hardirqs last enabled at (6250655): [<ffffffffb924c1bc>] kmem_cache_free+0xac/0x2f0 [ 572.225169] hardirqs last disabled at (6250656): [<ffffffffb9a00942>] common_interrupt+0x82/0x87 [ 572.225172] softirqs last enabled at (5250654): [<ffffffffb9c00393>] __do_softirq+0x393/0x4a6 [ 572.225175] softirqs last disabled at (5250629): [<ffffffffb908c3a1>] irq_exit+0xc1/0xd0 [ 572.225178] CPU: 2 PID: 120 Comm: kworker/2:2 Tainted: G L 4.14.39-ceph-g7d6240f0fb85 #1 [ 572.225180] Hardware name: Supermicro X8SIL/X8SIL, BIOS 1.0c 02/25/2010 [ 572.225184] Workqueue: kcopyd do_work [ 572.225186] task: ffff8848a6f40000 task.stack: ffff989142358000 [ 572.225192] RIP: 0010:copy_callback+0x47/0x190 [dm_snapshot] [ 572.225193] RSP: 0018:ffff98914235bd40 EFLAGS: 00000297 ORIG_RAX: ffffffffffffff9e [ 572.225196] RAX: ffff8845160b4940 RBX: ffff8848a72e64c8 RCX: 000000000000000f [ 572.225198] RDX: 000000000009c882 RSI: 0000000000000000 RDI: 0000000000000000 [ 572.225199] RBP: ffff8845a78b20f0 R08: 000000009cf9d101 R09: 0000000180070000 [ 572.225201] R10: ffff98914235bc70 R11: 0000000000000000 R12: ffffffffc0685550 [ 572.225203] R13: 0000000000000000 R14: 0000000000000000 R15: ffff8845a78b20f0 [ 572.225205] FS: 0000000000000000(0000) GS:ffff8848bfc80000(0000) knlGS:0000000000000000 [ 572.225206] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 572.225208] CR2: 00007f7eed163010 CR3: 0000000238210000 CR4: 00000000000006e0 [ 572.225210] Call Trace: [ 572.225217] ? origin_resume+0x80/0x80 [dm_snapshot] [ 572.225219] run_complete_job+0x5f/0xb0 [ 572.225224] process_jobs+0x97/0x210 [ 572.225226] ? drop_pages+0x40/0x40 [ 572.225231] do_work+0x45/0x90 [ 572.225238] ? process_one_work+0x1cf/0x610 [ 572.225240] ? process_jobs+0x210/0x210 [ 572.225242] process_one_work+0x1cf/0x610 [ 572.225249] worker_thread+0x204/0x3e0 [ 572.225255] kthread+0x10a/0x140 [ 572.225257] ? process_one_work+0x610/0x610 [ 572.225259] ? kthread_stop+0x290/0x290 [ 572.225262] ? kthread_stop+0x290/0x290 [ 572.225264] ret_from_fork+0x3a/0x50 [ 572.225273] Code: c0 09 d0 48 8b 55 50 0f b6 c0 89 45 4c 48 3b 93 c0 00 00 00 74 56 48 8b 83 d0 00 00 00 48 81 c3 c8 00 00 00 48 39 d8 75 0b eb 12 <48> 8b 40 08 48 39 d8 74 09 48 3b 50 f8 76 f1 48 89 c3 4c 8b 2b [ 576.621922] rbd: rbd0: encountered watch error: -107 [ 776.215313] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [kworker/2:2:120] [ 776.222852] Modules linked in: dm_snapshot dm_bufio xfs rbd libceph libcrc32c gpio_ich intel_powerclamp coretemp kvm_intel kvm irqbypass ipmi_si joydev serio_raw lpc_ich ipmi_devintf i7core_edac shpchp ipmi_msghandler ib_iser rdma_cm iw_cm ib_cm ib_core configfs nfsd auth_rpcgss iscsi_tcp nfs_acl lockd libiscsi_tcp grace libiscsi scsi_transport_iscsi sunrpc lp parport autofs4 btrfs xor zstd_decompress zstd_compress xxhash raid6_pq hid_generic usbhid hid e1000e ahci ptp psmouse libahci pps_core arcmsr [ 776.222924] irq event stamp: 37494502 [ 776.222930] hardirqs last enabled at (37494501): [<ffffffffb924c1bc>] kmem_cache_free+0xac/0x2f0 [ 776.222934] hardirqs last disabled at (37494502): [<ffffffffb9a00942>] common_interrupt+0x82/0x87 [ 776.222937] softirqs last enabled at (36754820): [<ffffffffb9c00393>] __do_softirq+0x393/0x4a6 [ 776.222941] softirqs last disabled at (36754811): [<ffffffffb908c3a1>] irq_exit+0xc1/0xd0 [ 776.222944] CPU: 2 PID: 120 Comm: kworker/2:2 Tainted: G L 4.14.39-ceph-g7d6240f0fb85 #1 [ 776.222945] Hardware name: Supermicro X8SIL/X8SIL, BIOS 1.0c 02/25/2010 [ 776.222949] Workqueue: kcopyd do_work [ 776.222952] task: ffff8848a6f40000 task.stack: ffff989142358000 [ 776.222957] RIP: 0010:copy_callback+0x47/0x190 [dm_snapshot] [ 776.222959] RSP: 0018:ffff98914235bd40 EFLAGS: 00000283 ORIG_RAX: ffffffffffffff9e [ 776.222962] RAX: ffff8848a4e66850 RBX: ffff8848a72e64c8 RCX: 000000000000000f [ 776.222963] RDX: 0000000000150329 RSI: 0000000000000000 RDI: 0000000000000000 [ 776.222965] RBP: ffff884861e02690 R08: 00000000e6503001 R09: 0000000180070002 [ 776.222967] R10: ffff98914235bc70 R11: 0000000000000000 R12: ffffffffc0685550 [ 776.222968] R13: 0000000000000000 R14: 0000000000000000 R15: ffff884861e02690 [ 776.222970] FS: 0000000000000000(0000) GS:ffff8848bfc80000(0000) knlGS:0000000000000000 [ 776.222972] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 776.222974] CR2: 00007f7eed163010 CR3: 0000000238210000 CR4: 00000000000006e0 [ 776.222976] Call Trace: [ 776.222982] ? origin_resume+0x80/0x80 [dm_snapshot] [ 776.222985] run_complete_job+0x5f/0xb0 [ 776.222990] process_jobs+0x97/0x210 [ 776.222992] ? drop_pages+0x40/0x40 [ 776.222996] do_work+0x45/0x90 [ 776.223004] ? process_one_work+0x1cf/0x610 [ 776.223006] ? process_jobs+0x210/0x210 [ 776.223008] process_one_work+0x1cf/0x610 [ 776.223014] worker_thread+0x204/0x3e0 [ 776.223020] kthread+0x10a/0x140 [ 776.223023] ? process_one_work+0x610/0x610 [ 776.223025] ? kthread_stop+0x290/0x290 [ 776.223028] ? kthread_stop+0x290/0x290 [ 776.223030] ret_from_fork+0x3a/0x50 [ 776.223039] Code: c0 09 d0 48 8b 55 50 0f b6 c0 89 45 4c 48 3b 93 c0 00 00 00 74 56 48 8b 83 d0 00 00 00 48 81 c3 c8 00 00 00 48 39 d8 75 0b eb 12 <48> 8b 40 08 48 39 d8 74 09 48 3b 50 f8 76 f1 48 89 c3 4c 8b 2b [ 787.554471] libceph: mon0 172.21.8.134:6789 session lost, hunting for new mon [ 787.594324] libceph: mon2 172.21.8.134:6790 session established [ 788.161173] rbd: rbd0: encountered watch error: -107 [ 868.214217] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [kworker/2:2:120] [ 868.221785] Modules linked in: dm_snapshot dm_bufio xfs rbd libceph libcrc32c gpio_ich intel_powerclamp coretemp kvm_intel kvm irqbypass ipmi_si joydev serio_raw lpc_ich ipmi_devintf i7core_edac shpchp ipmi_msghandler ib_iser rdma_cm iw_cm ib_cm ib_core configfs nfsd auth_rpcgss iscsi_tcp nfs_acl lockd libiscsi_tcp grace libiscsi scsi_transport_iscsi sunrpc lp parport autofs4 btrfs xor zstd_decompress zstd_compress xxhash raid6_pq hid_generic usbhid hid e1000e ahci ptp psmouse libahci pps_core arcmsr [ 868.221858] irq event stamp: 49247714 [ 868.221863] hardirqs last enabled at (49247713): [<ffffffffb924c1bc>] kmem_cache_free+0xac/0x2f0 [ 868.221867] hardirqs last disabled at (49247714): [<ffffffffb9a00942>] common_interrupt+0x82/0x87 [ 868.221870] softirqs last enabled at (48360804): [<ffffffffb9c00393>] __do_softirq+0x393/0x4a6 [ 868.221874] softirqs last disabled at (48360791): [<ffffffffb908c3a1>] irq_exit+0xc1/0xd0 [ 868.221877] CPU: 2 PID: 120 Comm: kworker/2:2 Tainted: G L 4.14.39-ceph-g7d6240f0fb85 #1 [ 868.221878] Hardware name: Supermicro X8SIL/X8SIL, BIOS 1.0c 02/25/2010 [ 868.221883] Workqueue: kcopyd do_work [ 868.221885] task: ffff8848a6f40000 task.stack: ffff989142358000 [ 868.221890] RIP: 0010:copy_callback+0x47/0x190 [dm_snapshot] [ 868.221892] RSP: 0018:ffff98914235bd40 EFLAGS: 00000297 ORIG_RAX: ffffffffffffff9e [ 868.221895] RAX: ffff8847300e0490 RBX: ffff8848a72e64c8 RCX: 000000000000000f [ 868.221897] RDX: 00000000001a1e68 RSI: 0000000000000000 RDI: 0000000000000000 [ 868.221898] RBP: ffff884732b4cc30 R08: 00000000321cb001 R09: 0000000180070002 [ 868.221900] R10: ffff98914235bc70 R11: 0000000000000000 R12: ffffffffc0685550 [ 868.221901] R13: 0000000000000000 R14: 0000000000000000 R15: ffff884732b4cc30 [ 868.221903] FS: 0000000000000000(0000) GS:ffff8848bfc80000(0000) knlGS:0000000000000000 [ 868.221905] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 868.221907] CR2: 00007f7eed163010 CR3: 0000000238210000 CR4: 00000000000006e0 [ 868.221909] Call Trace: [ 868.221916] ? origin_resume+0x80/0x80 [dm_snapshot] [ 868.221918] run_complete_job+0x5f/0xb0 [ 868.221923] process_jobs+0x97/0x210 [ 868.221926] ? drop_pages+0x40/0x40 [ 868.221930] do_work+0x45/0x90 [ 868.221938] ? process_one_work+0x1cf/0x610 [ 868.221939] ? process_jobs+0x210/0x210 [ 868.221941] process_one_work+0x1cf/0x610 [ 868.221948] worker_thread+0x204/0x3e0 [ 868.221954] kthread+0x10a/0x140 [ 868.221956] ? process_one_work+0x610/0x610 [ 868.221958] ? kthread_stop+0x290/0x290 [ 868.221961] ? kthread_stop+0x290/0x290 [ 868.221963] ret_from_fork+0x3a/0x50 [ 868.221972] Code: c0 09 d0 48 8b 55 50 0f b6 c0 89 45 4c 48 3b 93 c0 00 00 00 74 56 48 8b 83 d0 00 00 00 48 81 c3 c8 00 00 00 48 39 d8 75 0b eb 12 <48> 8b 40 08 48 39 d8 74 09 48 3b 50 f8 76 f1 48 89 c3 4c 8b 2b [ 882.902452] rbd: rbd0: encountered watch error: -107 [ 882.903177] libceph: mon2 172.21.8.134:6790 session lost, hunting for new mon [ 882.943953] libceph: mon0 172.21.7.140:6789 session established [ 956.213279] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [kworker/2:2:120] [ 956.220796] Modules linked in: dm_snapshot dm_bufio xfs rbd libceph libcrc32c gpio_ich intel_powerclamp coretemp kvm_intel kvm irqbypass ipmi_si joydev serio_raw lpc_ich ipmi_devintf i7core_edac shpchp ipmi_msghandler ib_iser rdma_cm iw_cm ib_cm ib_core configfs nfsd auth_rpcgss iscsi_tcp nfs_acl lockd libiscsi_tcp grace libiscsi scsi_transport_iscsi sunrpc lp parport autofs4 btrfs xor zstd_decompress zstd_compress xxhash raid6_pq hid_generic usbhid hid e1000e ahci ptp psmouse libahci pps_core arcmsr [ 956.220872] irq event stamp: 60295066 [ 956.220878] hardirqs last enabled at (60295065): [<ffffffffb924c1bc>] kmem_cache_free+0xac/0x2f0 [ 956.220881] hardirqs last disabled at (60295066): [<ffffffffb9a00942>] common_interrupt+0x82/0x87 [ 956.220884] softirqs last enabled at (59230404): [<ffffffffb9c00393>] __do_softirq+0x393/0x4a6 [ 956.220888] softirqs last disabled at (59230383): [<ffffffffb908c3a1>] irq_exit+0xc1/0xd0 [ 956.220891] CPU: 2 PID: 120 Comm: kworker/2:2 Tainted: G L 4.14.39-ceph-g7d6240f0fb85 #1 [ 956.220892] Hardware name: Supermicro X8SIL/X8SIL, BIOS 1.0c 02/25/2010 [ 956.220897] Workqueue: kcopyd do_work [ 956.220899] task: ffff8848a6f40000 task.stack: ffff989142358000 [ 956.220905] RIP: 0010:copy_callback+0x47/0x190 [dm_snapshot] [ 956.220906] RSP: 0018:ffff98914235bd40 EFLAGS: 00000297 ORIG_RAX: ffffffffffffff9e [ 956.220909] RAX: ffff88475ee1afd0 RBX: ffff8848a72e64c8 RCX: 000000000000000f [ 956.220911] RDX: 00000000001be9d8 RSI: 0000000000000000 RDI: 0000000000000000 [ 956.220913] RBP: ffff884870de74b0 R08: 00000000f168a001 R09: 0000000180070003 [ 956.220914] R10: ffff98914235bc70 R11: 0000000000000000 R12: ffffffffc0685550 [ 956.220916] R13: 0000000000000000 R14: 0000000000000000 R15: ffff884870de74b0 [ 956.220918] FS: 0000000000000000(0000) GS:ffff8848bfc80000(0000) knlGS:0000000000000000 [ 956.220920] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 956.220921] CR2: 00007fe2a394e0d1 CR3: 0000000238210000 CR4: 00000000000006e0 [ 956.220923] Call Trace: [ 956.220930] ? origin_resume+0x80/0x80 [dm_snapshot] [ 956.220932] run_complete_job+0x5f/0xb0 [ 956.220937] process_jobs+0x97/0x210 [ 956.220939] ? drop_pages+0x40/0x40 [ 956.220944] do_work+0x45/0x90 [ 956.220951] ? process_one_work+0x1cf/0x610 [ 956.220953] ? process_jobs+0x210/0x210 [ 956.220955] process_one_work+0x1cf/0x610 [ 956.220962] worker_thread+0x204/0x3e0 [ 956.220968] kthread+0x10a/0x140 [ 956.220970] ? process_one_work+0x610/0x610 [ 956.220972] ? kthread_stop+0x290/0x290 [ 956.220975] ? kthread_stop+0x290/0x290 [ 956.220977] ret_from_fork+0x3a/0x50 [ 956.220986] Code: c0 09 d0 48 8b 55 50 0f b6 c0 89 45 4c 48 3b 93 c0 00 00 00 74 56 48 8b 83 d0 00 00 00 48 81 c3 c8 00 00 00 48 39 d8 75 0b eb 12 <48> 8b 40 08 48 39 d8 74 09 48 3b 50 f8 76 f1 48 89 c3 4c 8b 2b [ 976.205376] libceph: mon0 172.21.7.140:6789 session lost, hunting for new mon [ 976.215595] rbd: rbd0: encountered watch error: -107 [ 977.259897] libceph: mon1 172.21.8.134:6789 session established [ 1022.307064] XFS (dm-0): Unmounting Filesystem
[ 2380.071730] XFS (dm-0): Mounting V5 Filesystem [ 2380.193708] XFS (dm-0): Ending clean mount [ 2568.169967] watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/1:0:2008] [ 2568.177642] Modules linked in: dm_snapshot dm_bufio xfs rbd libceph binfmt_misc gpio_ich intel_powerclamp ipmi_si ipmi_devintf coretemp serio_raw ipmi_msghandler i7core_edac joydev kvm_intel kvm lpc_ich irqbypass nfsd auth_rpcgss nfs_acl lockd grace sunrpc ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi lp parport autofs4 btrfs xor zstd_decompress zstd_compress xxhash raid6_pq libcrc32c hid_generic usbhid hid ahci psmouse libahci e1000e arcmsr [ 2568.177739] irq event stamp: 3795554 [ 2568.177745] hardirqs last enabled at (3795553): [<ffffffffa426ca19>] kmem_cache_free+0x109/0x2b0 [ 2568.177750] hardirqs last disabled at (3795554): [<ffffffffa4a00964>] interrupt_entry+0xc4/0xe0 [ 2568.177754] softirqs last enabled at (3398512): [<ffffffffa4c00455>] __do_softirq+0x455/0x491 [ 2568.177759] softirqs last disabled at (3398429): [<ffffffffa408fd3d>] irq_exit+0x5d/0xd0 [ 2568.177763] CPU: 1 PID: 2008 Comm: kworker/1:0 Not tainted 4.18.0-rc7-ceph-g8cb27127e6dd #1 [ 2568.177765] Hardware name: Supermicro X8SIL/X8SIL, BIOS 1.0c 02/25/2010 [ 2568.177770] Workqueue: kcopyd do_work [ 2568.177779] RIP: 0010:copy_callback+0x146/0x190 [dm_snapshot] [ 2568.177780] Code: c0 48 8b 07 0f 94 c2 ff 50 30 48 8b 85 c8 00 00 00 49 39 c4 75 84 eb 45 48 8b ad d0 00 00 00 49 39 ec 74 0f 48 3b 45 f8 77 09 <48> 8b 6d 08 49 39 ec 75 f1 4c 8b 6d 00 4c 8d 63 58 48 89 ee 4c 89 [ 2568.177880] RSP: 0018:ffffa946c521bd30 EFLAGS: 00000283 ORIG_RAX: ffffffffffffffdd [ 2568.177885] RAX: 0000000000005adc RBX: ffff910d8700a780 RCX: 000000000000000f [ 2568.177887] RDX: ffff910d8700a700 RSI: 0000000000000000 RDI: 0000000000000000 [ 2568.177889] RBP: ffff910b7299ce68 R08: 0000000000000000 R09: 0000000000000001 [ 2568.177891] R10: 0000000000000000 R11: 0000000000000000 R12: ffff910d8e8768c8 [ 2568.177893] R13: 0000000000000000 R14: 0000000000000000 R15: ffff910d8700a780 [ 2568.177895] FS: 0000000000000000(0000) GS:ffff910f3fc40000(0000) knlGS:0000000000000000 [ 2568.177898] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2568.177900] CR2: 00007fb8bec23000 CR3: 0000000047210000 CR4: 00000000000006e0 [ 2568.177902] Call Trace: [ 2568.177911] ? origin_resume+0x80/0x80 [dm_snapshot] [ 2568.177916] run_complete_job+0x75/0xa0 [ 2568.177921] ? client_free_pages+0x30/0x30 [ 2568.177925] process_jobs+0x41/0x230 [ 2568.177932] do_work+0x57/0xb0 [ 2568.177943] ? process_one_work+0x326/0x610 [ 2568.177946] process_one_work+0x326/0x610 [ 2568.177955] worker_thread+0x22e/0x390 [ 2568.177961] ? process_one_work+0x610/0x610 [ 2568.177965] kthread+0x120/0x130 [ 2568.177969] ? kthread_create_worker_on_cpu+0x70/0x70 [ 2568.177974] ret_from_fork+0x3a/0x50 [ 2908.166152] watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [kworker/2:9:1705] [ 2908.173787] Modules linked in: dm_snapshot dm_bufio xfs rbd libceph binfmt_misc gpio_ich intel_powerclamp ipmi_si ipmi_devintf coretemp serio_raw ipmi_msghandler i7core_edac joydev kvm_intel kvm lpc_ich irqbypass nfsd auth_rpcgss nfs_acl lockd grace sunrpc ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi lp parport autofs4 btrfs xor zstd_decompress zstd_compress xxhash raid6_pq libcrc32c hid_generic usbhid hid ahci psmouse libahci e1000e arcmsr [ 2908.173856] irq event stamp: 6970532 [ 2908.173862] hardirqs last enabled at (6970531): [<ffffffffa491fc13>] _raw_spin_unlock_irqrestore+0x43/0x60 [ 2908.173864] hardirqs last disabled at (6970532): [<ffffffffa4a00964>] interrupt_entry+0xc4/0xe0 [ 2908.173867] softirqs last enabled at (6059284): [<ffffffffa4c00455>] __do_softirq+0x455/0x491 [ 2908.173870] softirqs last disabled at (6059249): [<ffffffffa408fd3d>] irq_exit+0x5d/0xd0 [ 2908.173874] CPU: 2 PID: 1705 Comm: kworker/2:9 Tainted: G L 4.18.0-rc7-ceph-g8cb27127e6dd #1 [ 2908.173875] Hardware name: Supermicro X8SIL/X8SIL, BIOS 1.0c 02/25/2010 [ 2908.173880] Workqueue: kcopyd do_work [ 2908.173885] RIP: 0010:rcu_lockdep_current_cpu_online+0x17/0x60 [ 2908.173886] Code: 0d ee 80 f0 5b 83 e0 01 c3 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 65 8b 15 d4 80 f0 5b 81 e2 00 00 10 00 b8 01 00 00 00 <75> 36 65 ff 05 c0 80 f0 5b 48 c7 c2 40 47 02 00 65 48 03 15 d9 14 [ 2908.173958] RSP: 0018:ffffa946c4a7bcf8 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdd [ 2908.173961] RAX: 0000000000000001 RBX: fffff33680d6dd00 RCX: 0000000000000001 [ 2908.173962] RDX: 0000000000000000 RSI: 0000000000000002 RDI: fffff33680d6dd00 [ 2908.173963] RBP: 0000000000035b74 R08: 0000000000000000 R09: 0000000000000000 [ 2908.173965] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ 2908.173966] R13: 0000000000000000 R14: 0000000000000000 R15: ffff910d17d87f78 [ 2908.173968] FS: 0000000000000000(0000) GS:ffff910f3fc80000(0000) knlGS:0000000000000000 [ 2908.173970] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2908.173971] CR2: 00007f8d84e4f9e0 CR3: 0000000047210000 CR4: 00000000000006e0 [ 2908.173973] Call Trace: [ 2908.173977] rcu_read_lock_sched_held+0x43/0x90 [ 2908.173981] free_unref_page_prepare+0xf0/0x220 [ 2908.173986] free_unref_page+0x22/0x80 [ 2908.173989] kcopyd_put_pages+0x25/0x50 [ 2908.173995] ? origin_resume+0x80/0x80 [dm_snapshot] [ 2908.173997] run_complete_job+0x48/0xa0 [ 2908.174000] ? client_free_pages+0x30/0x30 [ 2908.174003] process_jobs+0x41/0x230 [ 2908.174009] do_work+0x57/0xb0 [ 2908.174016] ? process_one_work+0x326/0x610 [ 2908.174018] process_one_work+0x326/0x610 [ 2908.174025] worker_thread+0x22e/0x390 [ 2908.174029] ? process_one_work+0x610/0x610 [ 2908.174032] kthread+0x120/0x130 [ 2908.174035] ? kthread_create_worker_on_cpu+0x70/0x70 [ 2908.174038] ret_from_fork+0x3a/0x50 [ 2909.124605] libceph: mon1 172.21.7.140:6789 session lost, hunting for new mon [ 2909.200421] libceph: mon2 172.21.8.134:6790 session established [ 2984.165410] watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [kworker/2:9:1705] [ 2984.172956] Modules linked in: dm_snapshot dm_bufio xfs rbd libceph binfmt_misc gpio_ich intel_powerclamp ipmi_si ipmi_devintf coretemp serio_raw ipmi_msghandler i7core_edac joydev kvm_intel kvm lpc_ich irqbypass nfsd auth_rpcgss nfs_acl lockd grace sunrpc ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi lp parport autofs4 btrfs xor zstd_decompress zstd_compress xxhash raid6_pq libcrc32c hid_generic usbhid hid ahci psmouse libahci e1000e arcmsr [ 2984.173026] irq event stamp: 16939392 [ 2984.173031] hardirqs last enabled at (16939391): [<ffffffffa426ca19>] kmem_cache_free+0x109/0x2b0 [ 2984.173034] hardirqs last disabled at (16939392): [<ffffffffa4a00964>] interrupt_entry+0xc4/0xe0 [ 2984.173038] softirqs last enabled at (16008220): [<ffffffffa4c00455>] __do_softirq+0x455/0x491 [ 2984.173041] softirqs last disabled at (16008181): [<ffffffffa408fd3d>] irq_exit+0x5d/0xd0 [ 2984.173044] CPU: 2 PID: 1705 Comm: kworker/2:9 Tainted: G L 4.18.0-rc7-ceph-g8cb27127e6dd #1 [ 2984.173046] Hardware name: Supermicro X8SIL/X8SIL, BIOS 1.0c 02/25/2010 [ 2984.173050] Workqueue: kcopyd do_work [ 2984.173057] RIP: 0010:copy_callback+0x146/0x190 [dm_snapshot] [ 2984.173058] Code: c0 48 8b 07 0f 94 c2 ff 50 30 48 8b 85 c8 00 00 00 49 39 c4 75 84 eb 45 48 8b ad d0 00 00 00 49 39 ec 74 0f 48 3b 45 f8 77 09 <48> 8b 6d 08 49 39 ec 75 f1 4c 8b 6d 00 4c 8d 63 58 48 89 ee 4c 89 [ 2984.173129] RSP: 0018:ffffa946c4a7bd30 EFLAGS: 00000293 ORIG_RAX: ffffffffffffffdd [ 2984.173132] RAX: 00000000001530c2 RBX: ffff910d046243c0 RCX: 000000000000000f [ 2984.173134] RDX: ffff910d04624300 RSI: 0000000000000000 RDI: 0000000000000000 [ 2984.173135] RBP: ffff910d15855c10 R08: 0000000000000000 R09: 0000000000000001 [ 2984.173137] R10: 0000000000000000 R11: 0000000000000000 R12: ffff910d8e8768c8 [ 2984.173138] R13: 0000000000000000 R14: 0000000000000000 R15: ffff910d046243c0 [ 2984.173140] FS: 0000000000000000(0000) GS:ffff910f3fc80000(0000) knlGS:0000000000000000 [ 2984.173142] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 2984.173143] CR2: 00007f85b92bc4c0 CR3: 0000000047210000 CR4: 00000000000006e0 [ 2984.173145] Call Trace: [ 2984.173152] ? origin_resume+0x80/0x80 [dm_snapshot] [ 2984.173155] run_complete_job+0x75/0xa0 [ 2984.173159] ? client_free_pages+0x30/0x30 [ 2984.173162] process_jobs+0x41/0x230 [ 2984.173167] do_work+0x57/0xb0 [ 2984.173175] ? process_one_work+0x326/0x610 [ 2984.173177] process_one_work+0x326/0x610 [ 2984.173184] worker_thread+0x22e/0x390 [ 2984.173188] ? process_one_work+0x610/0x610 [ 2984.173191] kthread+0x120/0x130 [ 2984.173194] ? kthread_create_worker_on_cpu+0x70/0x70 [ 2984.173197] ret_from_fork+0x3a/0x50 [ 2997.298789] libceph: mon2 172.21.8.134:6790 session lost, hunting for new mon [ 2997.299080] rbd: rbd0: encountered watch error: -107 [ 2997.310849] libceph: mon1 172.21.8.134:6789 session established [ 3088.164135] watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [kworker/2:9:1705] [ 3088.171708] Modules linked in: dm_snapshot dm_bufio xfs rbd libceph binfmt_misc gpio_ich intel_powerclamp ipmi_si ipmi_devintf coretemp serio_raw ipmi_msghandler i7core_edac joydev kvm_intel kvm lpc_ich irqbypass nfsd auth_rpcgss nfs_acl lockd grace sunrpc ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi lp parport autofs4 btrfs xor zstd_decompress zstd_compress xxhash raid6_pq libcrc32c hid_generic usbhid hid ahci psmouse libahci e1000e arcmsr [ 3088.171779] irq event stamp: 26955152 [ 3088.171784] hardirqs last enabled at (26955151): [<ffffffffa426ca19>] kmem_cache_free+0x109/0x2b0 [ 3088.171787] hardirqs last disabled at (26955152): [<ffffffffa4a00964>] interrupt_entry+0xc4/0xe0 [ 3088.171791] softirqs last enabled at (26051876): [<ffffffffa4c00455>] __do_softirq+0x455/0x491 [ 3088.171794] softirqs last disabled at (26051843): [<ffffffffa408fd3d>] irq_exit+0x5d/0xd0 [ 3088.171797] CPU: 2 PID: 1705 Comm: kworker/2:9 Tainted: G L 4.18.0-rc7-ceph-g8cb27127e6dd #1 [ 3088.171798] Hardware name: Supermicro X8SIL/X8SIL, BIOS 1.0c 02/25/2010 [ 3088.171803] Workqueue: kcopyd do_work [ 3088.171810] RIP: 0010:copy_callback+0x146/0x190 [dm_snapshot] [ 3088.171811] Code: c0 48 8b 07 0f 94 c2 ff 50 30 48 8b 85 c8 00 00 00 49 39 c4 75 84 eb 45 48 8b ad d0 00 00 00 49 39 ec 74 0f 48 3b 45 f8 77 09 <48> 8b 6d 08 49 39 ec 75 f1 4c 8b 6d 00 4c 8d 63 58 48 89 ee 4c 89 [ 3088.171882] RSP: 0018:ffffa946c4a7bd30 EFLAGS: 00000293 ORIG_RAX: ffffffffffffffdd [ 3088.171886] RAX: 00000000001b5bb4 RBX: ffff910e1ca81438 RCX: 000000000000000f [ 3088.171887] RDX: ffff910e1ca81400 RSI: 0000000000000000 RDI: 0000000000000000 [ 3088.171888] RBP: ffff910e3e66e058 R08: 0000000000000000 R09: 0000000000000001 [ 3088.171890] R10: 0000000000000000 R11: 0000000000000000 R12: ffff910d8e8768c8 [ 3088.171891] R13: 0000000000000000 R14: 0000000000000000 R15: ffff910e1ca81438 [ 3088.171893] FS: 0000000000000000(0000) GS:ffff910f3fc80000(0000) knlGS:0000000000000000 [ 3088.171895] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 3088.171896] CR2: 00007f85b92bc4c0 CR3: 0000000047210000 CR4: 00000000000006e0 [ 3088.171898] Call Trace: [ 3088.171905] ? origin_resume+0x80/0x80 [dm_snapshot] [ 3088.171909] run_complete_job+0x75/0xa0 [ 3088.171913] ? client_free_pages+0x30/0x30 [ 3088.171915] process_jobs+0x41/0x230 [ 3088.171921] do_work+0x57/0xb0 [ 3088.171929] ? process_one_work+0x326/0x610 [ 3088.171930] process_one_work+0x326/0x610 [ 3088.171937] worker_thread+0x22e/0x390 [ 3088.171942] ? process_one_work+0x610/0x610 [ 3088.171945] kthread+0x120/0x130 [ 3088.171948] ? kthread_create_worker_on_cpu+0x70/0x70 [ 3088.171951] ret_from_fork+0x3a/0x50 [ 3167.122446] XFS (dm-0): Unmounting Filesystem
_______________________________________________ ceph-users mailing list ceph-users@xxxxxxxxxxxxxx http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com