On Sat, Mar 5, 2016 at 3:57 PM, Nicholas A. Bellinger <nab@xxxxxxxxxxxxxxx> wrote:
> On Sat, 2016-03-05 at 13:26 -0500, Dan Lane wrote:
>> On Sat, Mar 5, 2016 at 1:01 AM, Nicholas A. Bellinger
>> <nab@xxxxxxxxxxxxxxx> wrote:
>> > On Fri, 2016-03-04 at 01:49 -0500, Dan Lane wrote:
>> >> On Sun, Feb 28, 2016 at 4:02 PM, Nicholas A. Bellinger
>> >> <nab@xxxxxxxxxxxxxxx> wrote:
>> >> > On Sun, 2016-02-28 at 12:55 -0800, Nicholas A. Bellinger wrote:
>> >
>> > <SNIP>
>> >
>> >> > To reiterate again from:
>> >> >
>> >> > https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2113956
>> >> >
>> >> > Symptoms:
>> >> >
>> >> > "An ESXi 5.5 Update 2 or ESXi 6.0 host loses connectivity to a VMFS5
>> >> > datastore."
>> >> >
>> >> > "Note: These symptoms are seen in connection with the use of VAAI ATS
>> >> > heartbeat with storage arrays supplied by several different vendors."
>> >> >
>> >> > Cause:
>> >> >
>> >> > "A change in the VMFS heartbeat update method was introduced in ESXi 5.5
>> >> > Update 2, to help optimize the VMFS heartbeat process. Whereas the
>> >> > legacy method involves plain SCSI reads and writes with the VMware ESXi
>> >> > kernel handling validation, the new method offloads the validation step
>> >> > to the storage system. This is similar to other VAAI-related offloads.
>> >> >
>> >> > This optimization results in a significant increase in the volume of ATS
>> >> > commands the ESXi kernel issues to the storage system and resulting
>> >> > increased load on the storage system. Under certain circumstances, VMFS
>> >> > heartbeat using ATS may fail with false ATS miscompare which causes the
>> >> > ESXi kernel to reverify its access to VMFS datastores. This leads to the
>> >> > Lost access to datastore messages."
>> >> >
>> >>
>> >> Nicholas: The problem isn't with the ATS "bug"; in fact I don't have
>> >> any mention of ATS anywhere in my vmkernel.log
>> >>
>> >
>> > Yes, you are most certainly hitting ATS heartbeat timeouts.
>> >
>> > There is a reason why every major storage vendor says you MUST disable
>> > ATS heartbeat with all VMFS5 mounts with ESX v5.5u2+, or completely
>> > disable COMPARE_AND_WRITE altogether at target level.
>> >
>> > In a perfect world it would be nice to not have to make ESX host
>> > changes, but we don't live in a perfect world!
>> >
>> >> [root@labhost4:/tmp/scratch/log] grep ATS vmkernel.log
>> >> [root@labhost4:/tmp/scratch/log] zcat vmkernel.0.gz | grep ATS
>> >> [root@labhost4:/tmp/scratch/log] zcat vmkernel.1.gz | grep ATS
>> >> [root@labhost4:/tmp/scratch/log] zcat vmkernel.2.gz | grep ATS
>> >> [root@labhost4:/tmp/scratch/log] zcat vmkernel.3.gz | grep ATS
>> >> [root@labhost4:/tmp/scratch/log] zcat vmkernel.4.gz | grep ATS
>> >> [root@labhost4:/tmp/scratch/log]
>> >>
>> >
>> > That is not how you check for ATS heartbeats. See below.
>> >
>> >> Also, my friend David did disable ATS on his target server and the
>> >> crash still occurred.
>> >> I just got home a couple of hours ago so I
>> >> haven't had a chance, but the above tells me that the problem is not
>> >> related to ATS. Also, during this testing I only had one ESXi host
>> >> turned on, which is where the logs are from.
>> >> >> >> I just restarted the target server, and with pretty much zero load on >> >> the server I got this in messages on the target server: >> >> [ 275.145225] ABORT_TASK: Found referenced qla2xxx task_tag: 1184312 >> >> [ 275.145274] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 1184312 >> >> [ 312.412465] ABORT_TASK: Found referenced qla2xxx task_tag: 1176128 >> >> [ 312.412511] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 1176128 >> >> [ 313.413499] ABORT_TASK: Found referenced qla2xxx task_tag: 1219556 >> >> [ 318.729670] ABORT_TASK: Sending TMR_FUNCTION_COMPLETE for ref_tag: 1219556 >> >> [ 318.730244] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 1194652 >> >> [ 318.730737] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 1196720 >> >> [ 318.731215] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 1217708 >> >> [ 318.731658] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 1218896 >> >> [ 318.732111] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 1182024 >> >> [ 318.732531] ABORT_TASK: Sending TMR_TASK_DOES_NOT_EXIST for ref_tag: 1168032 >> >> [ 327.528277] ABORT_TASK: Found referenced qla2xxx task_tag: 1139300 >> >> >> >> See the attachment for the vmkernel.log from the same exact time >> >> period, it was too big for here. >> > >> > ATS is using SCSI command COMPARE_AND_WRITE = 0x89: >> > >> > # cat vmkernel-snip.log | grep 0x89 >> > 2016-03-04T06:29:01.597Z cpu15:33312)NMP: nmp_ThrottleLogForDevice:3178: Cmd 0x89 (0x43a185bfa5c0, 32790) to dev "naa.6001405a5ce03d8529a4341a454252a8" on path "vmhba1:C0:T1:L0" Failed: H:0x8 D:0x0 P:0x0 Possible sense data: 0x5 0x20 0x0. Act:EVAL >> > 2016-03-04T06:29:01.597Z cpu15:33312)ScsiDeviceIO: 2645: Cmd(0x43a185bfa5c0) 0x89, CmdSN 0x21f93 from world 32790 to dev "naa.6001405a5ce03d8529a4341a454252a8" failed H:0x8 D:0x0 P:0x0 Possible sense data: 0x5 0x20 0x0. 
>> > 2016-03-04T06:29:02.597Z cpu14:33312)ScsiDeviceIO: 2645: Cmd(0x43a186506240) 0x89, CmdSN 0x21f95 from world 32790 to dev "naa.6001405a5ce03d8529a4341a454252a8" failed H:0x8 D:0x0 P:0x0 Possible sense data: 0x5 0x20 0x0. >> > 2016-03-04T06:29:16.711Z cpu8:33312)ScsiDeviceIO: 2645: Cmd(0x439d8034b200) 0x89, CmdSN 0x21f97 from world 32825 to dev "naa.6001405a5ce03d8529a4341a454252a8" failed H:0x8 D:0x0 P:0x0 Possible sense data: 0x5 0x24 0x0. >> > 2016-03-04T06:29:29.744Z cpu8:33312)ScsiDeviceIO: 2645: Cmd(0x439d8550b3c0) 0x89, CmdSN 0x21f99 from world 32826 to dev "naa.6001405a5ce03d8529a4341a454252a8" failed H:0x8 D:0x0 P:0x0 Possible sense data: 0x5 0x24 0x0. >> > 2016-03-04T06:29:42.775Z cpu8:33312)ScsiDeviceIO: 2645: Cmd(0x43a180333700) 0x89, CmdSN 0x21f9b from world 32827 to dev "naa.6001405a5ce03d8529a4341a454252a8" failed H:0x8 D:0x0 P:0x0 Possible sense data: 0x5 0x20 0x0. >> > 2016-03-04T06:29:55.789Z cpu8:33312)ScsiDeviceIO: 2645: Cmd(0x439d87290fc0) 0x89, CmdSN 0x21f9d from world 32825 to dev "naa.6001405a5ce03d8529a4341a454252a8" failed H:0x8 D:0x0 P:0x0 Possible sense data: 0x5 0x20 0x0. >> > 2016-03-04T06:30:08.794Z cpu8:33312)ScsiDeviceIO: 2645: Cmd(0x439d8036eb80) 0x89, CmdSN 0x21f9f from world 32826 to dev "naa.6001405a5ce03d8529a4341a454252a8" failed H:0x8 D:0x0 P:0x0 Possible sense data: 0x5 0x24 0x0. >> > 2016-03-04T06:30:10.229Z cpu8:33312)NMP: nmp_ThrottleLogForDevice:3178: Cmd 0x89 (0x43a186f55480, 32827) to dev "naa.6001405a5ce03d8529a4341a454252a8" on path "vmhba1:C0:T1:L0" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x5 0x20 0x0. Act:EVAL >> > 2016-03-04T06:30:10.229Z cpu8:33312)ScsiDeviceIO: 2607: Cmd(0x43a186f55480) 0x89, CmdSN 0x21fa1 from world 32827 to dev "naa.6001405a5ce03d8529a4341a454252a8" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x5 0x20 0x0. 
>> > 2016-03-04T06:30:10.243Z cpu8:33312)NMP: nmp_ThrottleLogForDevice:3178: Cmd 0x89 (0x43a186f55480, 32827) to dev "naa.6001405a5ce03d8529a4341a454252a8" on path "vmhba1:C0:T1:L0" Failed: H:0xc D:0x0 P:0x0 Possible sense data: 0x5 0x20 0x0. Act:NONE >> > 2016-03-04T06:30:10.243Z cpu8:33312)ScsiDeviceIO: 2607: Cmd(0x43a186f55480) 0x89, CmdSN 0x21fa1 from world 32827 to dev "naa.6001405a5ce03d8529a4341a454252a8" failed H:0xc D:0x0 P:0x0 Possible sense data: 0x5 0x20 0x0. >> > 2016-03-04T06:30:19.854Z cpu10:33312)NMP: nmp_ThrottleLogForDevice:3178: Cmd 0x89 (0x43a186f55480, 32827) to dev "naa.6001405a5ce03d8529a4341a454252a8" on path "vmhba1:C0:T1:L0" Failed: H:0x1 D:0x0 P:0x0 Possible sense data: 0x5 0x20 0x0. Act:FAILOVER >> > 2016-03-04T06:30:22.029Z cpu15:33313)WARNING: NMP: nmpCompleteRetryForPath:352: Retry cmd 0x89 (0x43a186f55480) to dev "naa.6001405a5ce03d8529a4341a454252a8" failed on path "vmhba2:C0:T1:L0" H:0x8 D:0x0 P:0x0 Possible sense data: 0x5 0x20 0x0. >> > 2016-03-04T06:30:22.029Z cpu15:33313)ScsiDeviceIO: 2645: Cmd(0x43a186f55480) 0x89, CmdSN 0x21fa1 from world 32827 to dev "naa.6001405a5ce03d8529a4341a454252a8" failed H:0x8 D:0x0 P:0x0 Possible sense data: 0x5 0x20 0x0. >> > 2016-03-04T06:30:35.032Z cpu15:33313)NMP: nmp_ThrottleLogForDevice:3178: Cmd 0x89 (0x439d85181d40, 32825) to dev "naa.6001405a5ce03d8529a4341a454252a8" on path "vmhba2:C0:T1:L0" Failed: H:0x8 D:0x0 P:0x0 Possible sense data: 0x5 0x24 0x0. Act:EVAL >> > 2016-03-04T06:30:35.032Z cpu15:33313)ScsiDeviceIO: 2645: Cmd(0x439d85181d40) 0x89, CmdSN 0x21fa3 from world 32825 to dev "naa.6001405a5ce03d8529a4341a454252a8" failed H:0x8 D:0x0 P:0x0 Possible sense data: 0x5 0x24 0x0. >> > 2016-03-04T06:30:48.237Z cpu15:33313)ScsiDeviceIO: 2645: Cmd(0x439d855c33c0) 0x89, CmdSN 0x21fa5 from world 32826 to dev "naa.6001405a5ce03d8529a4341a454252a8" failed H:0x8 D:0x0 P:0x0 Possible sense data: 0x5 0x20 0x0. 
>> > 2016-03-04T06:31:01.442Z cpu15:33313)ScsiDeviceIO: 2645: Cmd(0x43a186a693c0) 0x89, CmdSN 0x21fa7 from world 32827 to dev "naa.6001405a5ce03d8529a4341a454252a8" failed H:0x8 D:0x0 P:0x0 Possible sense data: 0x5 0x20 0x0. >> > 2016-03-04T06:31:02.468Z cpu15:33313)NMP: nmp_ThrottleLogForDevice:3178: Cmd 0x89 (0x439d85181d40, 32825) to dev "naa.6001405a5ce03d8529a4341a454252a8" on path "vmhba2:C0:T1:L0" Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x5 0x24 0x0. Act:EVAL >> > 2016-03-04T06:31:02.468Z cpu15:33313)ScsiDeviceIO: 2607: Cmd(0x439d85181d40) 0x89, CmdSN 0x21fa9 from world 32825 to dev "naa.6001405a5ce03d8529a4341a454252a8" failed H:0x2 D:0x0 P:0x0 Possible sense data: 0x5 0x24 0x0. >> > 2016-03-04T06:31:02.483Z cpu15:33313)NMP: nmp_ThrottleLogForDevice:3178: Cmd 0x89 (0x439d85181d40, 32825) to dev "naa.6001405a5ce03d8529a4341a454252a8" on path "vmhba2:C0:T1:L0" Failed: H:0xc D:0x0 P:0x0 Possible sense data: 0x5 0x24 0x0. Act:NONE >> > 2016-03-04T06:31:02.483Z cpu15:33313)ScsiDeviceIO: 2607: Cmd(0x439d85181d40) 0x89, CmdSN 0x21fa9 from world 32825 to dev "naa.6001405a5ce03d8529a4341a454252a8" failed H:0xc D:0x0 P:0x0 Possible sense data: 0x5 0x24 0x0. >> > 2016-03-04T06:31:14.444Z cpu6:33061)ScsiDeviceIO: 2645: Cmd(0x439d85181d40) 0x89, CmdSN 0x21fa9 from world 32825 to dev "naa.6001405a5ce03d8529a4341a454252a8" failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x5 0x24 0x0. 
>> >
>> > And yes, the ATS heartbeat timeout corresponds to internal
>> > COMPARE_AND_WRITE failures:
>> >
>> > # cat vmkernel-snip.log | grep " HB"
>> > 2016-03-04T06:27:59.950Z cpu11:32827)HBX: 276: 'dracofiler-lun0': HB at offset 3956736 - Reclaimed heartbeat [Timeout]:
>> > 2016-03-04T06:30:21.781Z cpu2:33924)HBX: 2801: 'dracofiler-lun0': HB at offset 3956736 - Waiting for timed out HB:
>> > 2016-03-04T06:30:22.029Z cpu2:33924)HBX: 2801: 'dracofiler-lun0': HB at offset 3956736 - Waiting for timed out HB:
>> > 2016-03-04T06:30:35.234Z cpu2:33924)HBX: 2801: 'dracofiler-lun0': HB at offset 3956736 - Waiting for timed out HB:
>> > 2016-03-04T06:30:48.439Z cpu2:33924)HBX: 2801: 'dracofiler-lun0': HB at offset 3956736 - Waiting for timed out HB:
>> > 2016-03-04T06:31:01.442Z cpu2:33924)HBX: 2801: 'dracofiler-lun0': HB at offset 3956736 - Waiting for timed out HB:
>> > 2016-03-04T06:31:14.444Z cpu2:33924)HBX: 2801: 'dracofiler-lun0': HB at offset 3956736 - Waiting for timed out HB:
>> >
>> > The sense ASC codes above for failed COMPARE_AND_WRITE = 0x89 commands are:
>> >
>> > 0x20 = /* INVALID COMMAND OPERATION CODE */
>> > 0x24 = /* INVALID FIELD IN CDB */
>> >
>> > which confirms ATS commands are being internally aborted on your ESX
>> > v5.5u2+ host side config, resulting in repeated ATS heartbeat timeouts
>> > until ESX takes the device offline.
>> >
>> >>
>> >> This is the point where I can no longer control the service on the
>> >> target, running service target stop results in the aforementioned hung
>> >> task, here is the output of "cat /proc/$PID/stack" after I try
>> >> stopping the hung task:
>> >> [root@dracofiler ~]# cat /proc/1911/stack
>> >> [<ffffffffa053c0ee>] tcm_qla2xxx_tpg_enable_store+0xde/0x1a0 [tcm_qla2xxx]
>> >> [<ffffffff812b8b7a>] configfs_write_file+0x9a/0x100
>> >> [<ffffffff81234967>] __vfs_write+0x37/0x120
>> >> [<ffffffff81235289>] vfs_write+0xa9/0x1a0
>> >> [<ffffffff812361b5>] SyS_write+0x55/0xc0
>> >> [<ffffffff817aa56e>] entry_SYSCALL_64_fastpath+0x12/0x71
>> >> [<ffffffffffffffff>] 0xffffffffffffffff
>> >>
>> >> It's been almost a week since I worked on this, so please forgive me
>> >> if I missed one of your suggestions for something to try or request
>> >> for information that I missed. Just let me know what it is and I'll
>> >> do it.
>> >
>> > To reiterate, the hung task shutdown bug you are triggering with Fedora
>> > -fb is separate from the ESX ATS heartbeat issues detailed above.
>> >
>> > So first things first for ESX. Let's finally see how ATS w/o ATS
>> > heartbeat works for your VMFS5 mounts to avoid generating constant ATS
>> > heartbeat timeouts + ABORT_TASKs.
>> >
>> > Otherwise you need to explicitly set emulate_caw=0 to enforce pre-ATS
>> > operation for all backend devices in order to get a working ESX v5.5u2+
>> > environment, if you can't change ESX host configs.
>> >
>>
>> SUCCESS!!!
>>
>> It's limited success, I don't think we're totally out of the woods
>> yet, but you were right about the ATS stuff, disabling it has made my
>> storage stable.
>>
>
> To confirm, did you verify with option #1 host side ATS heartbeat
> disabled, or option #2 with target side emulate_caw=0..?
>
>> I think the critical thing for me was understanding that the ATS bug
>> and the hung task shutdown bug are two separate problems.
>> In the past the two seemed directly linked, since the service only seems to
>> stop responding to control commands after the host had dropped the path,
>> might just be a coincidence though.
>>
>> I do still have some errors that I would like to discuss:
>>
>> In the target server messages log:
>> Mar 5 12:45:11 dracofiler kernel: ABORT_TASK: Found referenced
>> qla2xxx task_tag: 1194916
>> Mar 5 12:45:11 dracofiler kernel: ABORT_TASK: Sending
>> TMR_TASK_DOES_NOT_EXIST for ref_tag: 1194916
>> (I get a group of 3-5 of these errors about every 5 minutes)
>>
>
> As we verified earlier, your qla2xxx FC stats counters look as expected
> on the target side. Have you looked at ESX host side FC stats counters yet..?
>
> If those look OK too, then I'd recommend following the previous
> discussion to reduce ESX host side qla2xxx LLD queue depth and/or
> increase I/O timeouts, and see if that has any effect on avoiding these
> constant ABORT_TASK timeouts.
>
>> in the ESXi vmkernel.log:
>> 2016-03-05T17:44:30.838Z cpu9:33312)WARNING: NMP:
>> nmp_DeviceRequestFastDeviceProbe:237: NMP device
>> "naa.6001405a5ce03d8529a4341a454252a8" state in doubt; requested fast
>> path state update...
>> 2016-03-05T17:44:30.838Z cpu9:33312)ScsiDeviceIO: 2645:
>> Cmd(0x439d8746d7c0) 0x12, CmdSN 0x29181 from world 0 to dev
>> "naa.6001405a5ce03d8529a4341a454252a8" failed H:0x8 D:0x0 P:0x0
>> Possible sense data: 0x5 0x24 0x0.
>> 2016-03-05T17:44:51.278Z cpu9:33312)NMP:
>> nmp_ThrottleLogForDevice:3178: Cmd 0x12 (0x439d8746d7c0, 0) to dev
>> "naa.6001405a5ce03d8529a4341a454252a8" on path "vmhba1:C0:T1:L0"
>> Failed: H:0x2 D:0x0 P:0x0 Possible sense data: 0x5 0x24 0x0. Act:EVAL
>> (These errors seem to come at about the same interval as the ABORT_TASK errors)
>> 2016-03-05T16:15:55.803Z cpu15:33057)NMP:
>> nmp_ThrottleLogForDevice:3178: Cmd 0x9e (0x43a180365300, 0) to dev
>> "naa.500000e015072570" on path "vmhba0:C0:T0:L0" Failed: H:0x0 D:0x2
>> P:0x0 Valid sense data: 0x5 0x20 0x0. Act:NONE
>> (these are much less common than the 0x12 command failures)
>>
>> Most importantly of course is that the storage is NOT becoming
>> unavailable during these events!
>>
>
> This indicates the number of internally failed SCSI commands for ESX
> is now lower without all the extra COMPARE_AND_WRITE commands for
> ATS heartbeat, and below ESX's internal threshold for retries before
> taking the device offline.
>
> Can the device still be taken offline (eventually) due to constant
> ABORT_TASKs, even without ATS heartbeat or COMPARE_AND_WRITE completely
> disabled..? Most certainly, if ESX's SCSI I/O retry limit is reached
> for the same command.
>
>> If I understand correctly, these map to INQUIRY and SERVICE ACTION
>> IN(16). I don't know if these are critical or not, but seeing the
>> "state in doubt; requested fast path state update" message concerns
>> me.
>>
>
> INQUIRY is for querying LUN metadata, and SERVICE_ACTION_IN(16) is most
> likely a SAI_READ_CAPACITY_16 query.
>
> Both of these commands are not affected by backend I/O performance from the
> target perspective, which further indicates some manner of ESX SCSI host
> side issue to debug.
>

Unfortunately it looks like the target caused a kernel oops before I had a
chance to test those. Please see the attached crash log.

The one thing I can answer is that I used "option #1 host side ATS
heartbeat disabled".

Dan
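For anyone repeating the opcode / sense code triage from this thread, here is a minimal sketch of a tally script. It is not an ESXi tool, only an illustration under some assumptions: each vmkernel.log entry sits on a single line (the entries quoted above were wrapped by the mail client), only the "ScsiDeviceIO" entries are counted so that a failure also reported by an NMP entry is not counted twice, and the name tables hold only the opcodes and ASC values already identified in this thread. The function names tally_failures and describe are made up for the example.

```python
import re
from collections import Counter

# Only the opcodes and ASC values discussed in this thread; anything else
# is printed as a raw hex value rather than guessed at.
OPCODES = {
    0x12: "INQUIRY",
    0x89: "COMPARE_AND_WRITE",
    0x9E: "SERVICE_ACTION_IN(16)",
}
ASC_NAMES = {
    0x20: "INVALID COMMAND OPERATION CODE",
    0x24: "INVALID FIELD IN CDB",
}

# Matches entries shaped like the ScsiDeviceIO lines quoted in this thread:
#   ...ScsiDeviceIO: 2645: Cmd(0x...) 0x89, CmdSN ... failed H:0x8 D:0x0 P:0x0
#   Possible sense data: 0x5 0x20 0x0.
LINE_RE = re.compile(
    r"ScsiDeviceIO: \d+: Cmd\(0x[0-9a-f]+\) (0x[0-9a-f]+),"
    r".*?H:(0x[0-9a-f]+) D:(0x[0-9a-f]+) P:(0x[0-9a-f]+)"
    r" (?:Possible|Valid) sense data: (0x[0-9a-f]+) (0x[0-9a-f]+) (0x[0-9a-f]+)",
    re.IGNORECASE,
)


def tally_failures(lines):
    """Count failed commands, keyed by (opcode, host status, ASC)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # not a ScsiDeviceIO failure entry
        opcode, host, _dev, _plugin, _key, asc, _ascq = (
            int(value, 16) for value in match.groups()
        )
        counts[(opcode, host, asc)] += 1
    return counts


def describe(counts):
    """Render one text row per (opcode, host status, ASC) combination."""
    rows = []
    for (opcode, host, asc), total in sorted(counts.items()):
        op_name = OPCODES.get(opcode, hex(opcode))
        asc_name = ASC_NAMES.get(asc, "(not listed in this thread)")
        rows.append(f"{total:4d}  {op_name}  H:{host:#x}  ASC {asc:#x} {asc_name}")
    return rows
```

To use it, pass any iterable of log lines, for example an open file handle on a copy of vmkernel.log, to tally_failures() and print the rows returned by describe(). Keying on the host status as well as the opcode keeps the H:0x8 failures apart from the H:0x2, H:0xc and H:0x5 ones seen in the 0x89 output above.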
Mar 5 15:25:11 dracofiler kernel: ------------[ cut here ]------------ Mar 5 15:25:11 dracofiler kernel: WARNING: CPU: 4 PID: 2255 at lib/list_debug.c:71 __list_del_entry+0xbb/0xc0() Mar 5 15:25:11 dracofiler kernel: list_del corruption. next->prev should be ffff880624cd9830, but was ffff880624a1f4d0 Mar 5 15:25:11 dracofiler kernel: Modules linked in: tcm_qla2xxx target_core_user uio target_core_pscsi target_core_file target_core_iblock iscsi_target_mod target_core_mod coretemp kvm_intel kvm irqbypass snd_hda_codec_realtek crct10dif_pclmul snd_hda_codec_generic snd_hda_intel snd_hda_codec crc32_pclmul ses ghash_clmulni_intel snd_hda_core snd_hwdep gpio_ich enclosure iTCO_wdt iTCO_vendor_support joydev snd_seq scsi_transport_sas pcspkr snd_seq_device i5500_temp tpm_infineon snd_pcm shpchp snd_timer acpi_cpufreq snd ioatdma lpc_ich soundcore i2c_i801 i7core_edac edac_core tpm_tis tpm nfsd nfs_acl lockd grace auth_rpcgss sunrpc xfs libcrc32c ata_generic pata_acpi mgag200 drm_kms_helper qla2xxx ttm drm igb firewire_ohci crc32c_intel ptp firewire_core serio_raw megaraid_sas scsi_transport_fc pps_core crc_itu_t dca pata_jmicron Mar 5 15:25:11 dracofiler kernel: fjes i2c_algo_bit Mar 5 15:25:11 dracofiler kernel: CPU: 4 PID: 2255 Comm: kworker/4:1 Tainted: G I 4.5.0-rc5+ #2 Mar 5 15:25:11 dracofiler kernel: Hardware name: iXsystems iX2224-E16R1200LPB/2.5/X8DAH, BIOS 2.1 12/30/2011 Mar 5 15:25:11 dracofiler kernel: Workqueue: tcm_qla2xxx_free tcm_qla2xxx_complete_free [tcm_qla2xxx] Mar 5 15:25:11 dracofiler kernel: 0000000000000086 000000009e22f9f0 ffff88062fd6fd68 ffffffff813c1e0e Mar 5 15:25:11 dracofiler kernel: ffff88062fd6fdb0 ffffffff81a9db7d ffff88062fd6fda0 ffffffff810a33b2 Mar 5 15:25:11 dracofiler kernel: ffff880631b09c80 ffff88063fa165c0 ffffe8ffffa03700 ffff880624cd9830 Mar 5 15:25:11 dracofiler kernel: Call Trace: Mar 5 15:25:11 dracofiler kernel: [<ffffffff813c1e0e>] dump_stack+0x63/0x85 Mar 5 15:25:11 dracofiler kernel: [<ffffffff810a33b2>] 
warn_slowpath_common+0x82/0xc0 Mar 5 15:25:11 dracofiler kernel: [<ffffffff810a344c>] warn_slowpath_fmt+0x5c/0x80 Mar 5 15:25:11 dracofiler kernel: [<ffffffffa065d2ae>] ? transport_generic_free_cmd+0x4e/0x140 [target_core_mod] Mar 5 15:25:11 dracofiler kernel: [<ffffffff813df0cb>] __list_del_entry+0xbb/0xc0 Mar 5 15:25:11 dracofiler kernel: [<ffffffff810bb657>] process_one_work+0xd7/0x430 Mar 5 15:25:11 dracofiler kernel: [<ffffffff810bb9fe>] worker_thread+0x4e/0x480 Mar 5 15:25:11 dracofiler kernel: [<ffffffff817a5cf7>] ? __schedule+0x3a7/0xa00 Mar 5 15:25:11 dracofiler kernel: [<ffffffff810bb9b0>] ? process_one_work+0x430/0x430 Mar 5 15:25:11 dracofiler kernel: [<ffffffff810bb9b0>] ? process_one_work+0x430/0x430 Mar 5 15:25:11 dracofiler kernel: [<ffffffff810c1388>] kthread+0xd8/0xf0 Mar 5 15:25:11 dracofiler kernel: [<ffffffff810c12b0>] ? kthread_worker_fn+0x170/0x170 Mar 5 15:25:11 dracofiler kernel: [<ffffffff817aa8cf>] ret_from_fork+0x3f/0x70 Mar 5 15:25:11 dracofiler kernel: [<ffffffff810c12b0>] ? 
kthread_worker_fn+0x170/0x170 Mar 5 15:25:11 dracofiler kernel: ---[ end trace 1b179ad71c41d908 ]--- Mar 5 15:25:11 dracofiler kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 Mar 5 15:25:11 dracofiler kernel: IP: [<ffffffff810bb5b5>] process_one_work+0x35/0x430 Mar 5 15:25:11 dracofiler kernel: PGD 0 Mar 5 15:25:11 dracofiler kernel: Oops: 0000 [#1] SMP Mar 5 15:25:11 dracofiler kernel: Modules linked in: tcm_qla2xxx target_core_user uio target_core_pscsi target_core_file target_core_iblock iscsi_target_mod target_core_mod coretemp kvm_intel kvm irqbypass snd_hda_codec_realtek crct10dif_pclmul snd_hda_codec_generic snd_hda_intel snd_hda_codec crc32_pclmul ses ghash_clmulni_intel snd_hda_core snd_hwdep gpio_ich enclosure iTCO_wdt iTCO_vendor_support joydev snd_seq scsi_transport_sas pcspkr snd_seq_device i5500_temp tpm_infineon snd_pcm shpchp snd_timer acpi_cpufreq snd ioatdma lpc_ich soundcore i2c_i801 i7core_edac edac_core tpm_tis tpm nfsd nfs_acl lockd grace auth_rpcgss sunrpc xfs libcrc32c ata_generic pata_acpi mgag200 drm_kms_helper qla2xxx ttm drm igb firewire_ohci crc32c_intel ptp firewire_core serio_raw megaraid_sas scsi_transport_fc pps_core crc_itu_t dca pata_jmicron Mar 5 15:25:11 dracofiler kernel: fjes i2c_algo_bit Mar 5 15:25:11 dracofiler kernel: CPU: 4 PID: 2255 Comm: kworker/4:1 Tainted: G W I 4.5.0-rc5+ #2 Mar 5 15:25:11 dracofiler kernel: Hardware name: iXsystems iX2224-E16R1200LPB/2.5/X8DAH, BIOS 2.1 12/30/2011 Mar 5 15:25:11 dracofiler kernel: task: ffff880630f43880 ti: ffff88062fd6c000 task.ti: ffff88062fd6c000 Mar 5 15:25:11 dracofiler kernel: RIP: 0010:[<ffffffff810bb5b5>] [<ffffffff810bb5b5>] process_one_work+0x35/0x430 Mar 5 15:25:11 dracofiler kernel: RSP: 0018:ffff88062fd6fe28 EFLAGS: 00010046 Mar 5 15:25:11 dracofiler kernel: RAX: 0000000000000000 RBX: ffff880631b09c80 RCX: ffffe8ffffa03760 Mar 5 15:25:11 dracofiler kernel: RDX: 00000001029bb396 RSI: ffff880624cd9828 RDI: ffff880631b09c80 Mar 5 15:25:11 
dracofiler kernel: RBP: ffff88062fd6fe58 R08: ffffffffa0231a20 R09: ffff880624cd9620 Mar 5 15:25:11 dracofiler kernel: R10: ffff88032de9ec80 R11: 00000000000047ac R12: ffff88063fa165c0 Mar 5 15:25:11 dracofiler kernel: R13: 0000000000000000 R14: ffff88063fa165e0 R15: ffff880624cd9828 Mar 5 15:25:11 dracofiler kernel: FS: 0000000000000000(0000) GS:ffff88063fa00000(0000) knlGS:0000000000000000 Mar 5 15:25:11 dracofiler kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b Mar 5 15:25:11 dracofiler kernel: CR2: 00000000000000b8 CR3: 0000000001c0a000 CR4: 00000000000006e0 Mar 5 15:25:11 dracofiler kernel: Stack: Mar 5 15:25:11 dracofiler kernel: 0000000000000000 ffff88063fa165c0 ffff880631b09cb0 0000000000000008 Mar 5 15:25:11 dracofiler kernel: ffff88063fa165e0 ffff880631b09c80 ffff88062fd6fec0 ffffffff810bb9fe Mar 5 15:25:11 dracofiler kernel: ffffffff817a5cf7 ffff880630f43880 ffff880630f43880 ffff880630f43880 Mar 5 15:25:11 dracofiler kernel: Call Trace: Mar 5 15:25:11 dracofiler kernel: [<ffffffff810bb9fe>] worker_thread+0x4e/0x480 Mar 5 15:25:11 dracofiler kernel: [<ffffffff817a5cf7>] ? __schedule+0x3a7/0xa00 Mar 5 15:25:11 dracofiler kernel: [<ffffffff810bb9b0>] ? process_one_work+0x430/0x430 Mar 5 15:25:11 dracofiler kernel: [<ffffffff810bb9b0>] ? process_one_work+0x430/0x430 Mar 5 15:25:11 dracofiler kernel: [<ffffffff810c1388>] kthread+0xd8/0xf0 Mar 5 15:25:11 dracofiler kernel: [<ffffffff810c12b0>] ? kthread_worker_fn+0x170/0x170 Mar 5 15:25:11 dracofiler kernel: [<ffffffff817aa8cf>] ret_from_fork+0x3f/0x70 Mar 5 15:25:11 dracofiler kernel: [<ffffffff810c12b0>] ? 
kthread_worker_fn+0x170/0x170 Mar 5 15:25:11 dracofiler kernel: Code: 57 41 56 41 55 41 54 49 89 f7 53 48 89 fb 48 83 ec 08 48 8b 06 4c 8b 67 48 49 89 c5 41 80 e5 00 a8 04 b8 00 00 00 00 4c 0f 44 e8 <49> 8b 45 08 44 8b b0 00 01 00 00 41 83 e6 20 41 f6 44 24 10 04 Mar 5 15:25:11 dracofiler kernel: RIP [<ffffffff810bb5b5>] process_one_work+0x35/0x430 Mar 5 15:25:11 dracofiler kernel: RSP <ffff88062fd6fe28> Mar 5 15:25:11 dracofiler kernel: CR2: 0000000000000008 Mar 5 15:25:11 dracofiler kernel: ---[ end trace 1b179ad71c41d909 ]--- Message from syslogd@dracofiler at Mar 5 15:26:11 ... kernel:NMI watchdog: Watchdog detected hard LOCKUP on cpu 4 Mar 5 15:26:11 dracofiler kernel: NMI watchdog: Watchdog detected hard LOCKUP on cpu 4 Mar 5 15:26:11 dracofiler kernel: Modules linked in: Mar 5 15:26:11 dracofiler kernel: tcm_qla2xxx target_core_user uio target_core_pscsi target_core_file target_core_iblock iscsi_target_mod target_core_mod coretemp kvm_intel kvm irqbypass snd_hda_codec_realtek crct10dif_pclmul snd_hda_codec_generic snd_hda_intel snd_hda_codec crc32_pclmul ses ghash_clmulni_intel snd_hda_core snd_hwdep gpio_ich enclosure iTCO_wdt iTCO_vendor_support joydev snd_seq scsi_transport_sas pcspkr snd_seq_device i5500_temp tpm_infineon snd_pcm shpchp snd_timer acpi_cpufreq snd ioatdma lpc_ich soundcore i2c_i801 i7core_edac edac_core tpm_tis tpm nfsd nfs_acl lockd grace auth_rpcgss sunrpc xfs libcrc32c ata_generic pata_acpi mgag200 drm_kms_helper qla2xxx ttm drm igb firewire_ohci crc32c_intel ptp firewire_core serio_raw megaraid_sas scsi_transport_fc pps_core crc_itu_t dca pata_jmicron fjes i2c_algo_bit Mar 5 15:26:11 dracofiler kernel: Mar 5 15:26:11 dracofiler kernel: CPU: 4 PID: 2255 Comm: kworker/4:1 Tainted: G D W I 4.5.0-rc5+ #2 Mar 5 15:26:11 dracofiler kernel: Hardware name: iXsystems iX2224-E16R1200LPB/2.5/X8DAH, BIOS 2.1 12/30/2011 Mar 5 15:26:11 dracofiler kernel: task: ffff880630f43880 ti: ffff88062fd6c000 task.ti: ffff88062fd6c000 Mar 5 15:26:11 dracofiler 
kernel: RIP: 0010:[<ffffffff810a7c28>] [<ffffffff810a7c28>] __do_softirq+0x78/0x2d0 Mar 5 15:26:11 dracofiler kernel: RSP: 0018:ffff88063fa03f28 EFLAGS: 00000286 Mar 5 15:26:11 dracofiler kernel: RAX: ffff88062fd70000 RBX: 0000000000000000 RCX: 0000000000000020 Mar 5 15:26:11 dracofiler kernel: RDX: 002de5cd7320f440 RSI: 00000000000005e1 RDI: ffff880630f43880 Mar 5 15:26:11 dracofiler kernel: RBP: ffff88063fa03f78 R08: ffff88062df4e800 R09: 0000000000000004 Mar 5 15:26:11 dracofiler kernel: R10: 0000000000000005 R11: 0000000000000020 R12: 0000000000000009 Mar 5 15:26:11 dracofiler kernel: R13: ffff88062b403300 R14: ffff880630f43880 R15: 0000000000000008 Mar 5 15:26:11 dracofiler kernel: FS: 0000000000000000(0000) GS:ffff88063fa00000(0000) knlGS:0000000000000000 Mar 5 15:26:11 dracofiler kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b Mar 5 15:26:11 dracofiler kernel: CR2: 00000000000000b8 CR3: 0000000001c0a000 CR4: 00000000000006e0 Mar 5 15:26:11 dracofiler kernel: Stack: Mar 5 15:26:11 dracofiler kernel: 042080643fa0f128 ffff88062fd70000 00000001029bb3b5 000028120000000a Mar 5 15:26:11 dracofiler kernel: 000002023fa0f098 0000000000000000 0000000000000009 ffff88062b403300 Mar 5 15:26:11 dracofiler kernel: ffff880630f43880 0000000000000008 ffff88063fa03f90 ffffffff810a8082 Mar 5 15:26:11 dracofiler kernel: Call Trace: Mar 5 15:26:11 dracofiler kernel: <IRQ> Mar 5 15:26:11 dracofiler kernel: [<ffffffff810a8082>] irq_exit+0x102/0x110 Mar 5 15:26:11 dracofiler kernel: [<ffffffff817ad222>] smp_apic_timer_interrupt+0x42/0x50 Mar 5 15:26:11 dracofiler kernel: [<ffffffff817ab35c>] apic_timer_interrupt+0x8c/0xa0 Mar 5 15:26:11 dracofiler kernel: <EOI> Mar 5 15:26:11 dracofiler kernel: [<ffffffff8112af54>] ? 
acct_collect+0x184/0x1d0 Mar 5 15:26:11 dracofiler kernel: [<ffffffff810a6084>] do_exit+0x4f4/0xb30 Mar 5 15:26:11 dracofiler kernel: [<ffffffff810198aa>] oops_end+0x9a/0xd0 Mar 5 15:26:11 dracofiler kernel: [<ffffffff81066b95>] no_context+0x135/0x380 Mar 5 15:26:11 dracofiler kernel: [<ffffffff810f89bc>] ? console_unlock+0x20c/0x540 Mar 5 15:26:11 dracofiler kernel: [<ffffffff81066e60>] __bad_area_nosemaphore+0x80/0x1f0 Mar 5 15:26:11 dracofiler kernel: [<ffffffff81066fe3>] bad_area_nosemaphore+0x13/0x20 Mar 5 15:26:11 dracofiler kernel: [<ffffffff810672a7>] __do_page_fault+0xb7/0x400 Mar 5 15:26:11 dracofiler kernel: [<ffffffff811a95b3>] ? printk+0x57/0x73 Mar 5 15:26:11 dracofiler kernel: [<ffffffff8106761f>] do_page_fault+0x2f/0x80 Mar 5 15:26:11 dracofiler kernel: [<ffffffff817ac908>] page_fault+0x28/0x30 Mar 5 15:26:11 dracofiler kernel: [<ffffffff810bb5b5>] ? process_one_work+0x35/0x430 Mar 5 15:26:11 dracofiler kernel: [<ffffffff810bb75f>] ? process_one_work+0x1df/0x430 Mar 5 15:26:11 dracofiler kernel: [<ffffffff810bb9fe>] worker_thread+0x4e/0x480 Mar 5 15:26:11 dracofiler kernel: [<ffffffff817a5cf7>] ? __schedule+0x3a7/0xa00 Mar 5 15:26:11 dracofiler kernel: [<ffffffff810bb9b0>] ? process_one_work+0x430/0x430 Mar 5 15:26:11 dracofiler kernel: [<ffffffff810bb9b0>] ? process_one_work+0x430/0x430 Mar 5 15:26:11 dracofiler kernel: [<ffffffff810c1388>] kthread+0xd8/0xf0 Mar 5 15:26:11 dracofiler kernel: [<ffffffff810c12b0>] ? kthread_worker_fn+0x170/0x170 Mar 5 15:26:11 dracofiler kernel: [<ffffffff817aa8cf>] ret_from_fork+0x3f/0x70 Mar 5 15:26:11 dracofiler kernel: [<ffffffff810c12b0>] ? 
kthread_worker_fn+0x170/0x170 Mar 5 15:26:11 dracofiler kernel: Code: 4b 40 f6 7e 00 01 00 00 65 48 8b 04 25 c4 42 01 00 c7 45 c8 0a 00 00 00 48 89 45 b8 65 c7 05 dc c5 f6 7e 00 00 00 00 fb 66 66 90 <66> 66 90 b8 ff ff ff ff 49 c7 c4 c0 90 c0 81 0f bc 45 d4 83 c0 Mar 5 15:26:11 dracofiler kernel: INFO: rcu_sched detected stalls on CPUs/tasks: Mar 5 15:26:11 dracofiler kernel: #0114-...: (1 GPs behind) idle=705/140000000000002/0 softirq=71174/71175 fqs=19796 Mar 5 15:26:11 dracofiler kernel: #011(detected by 7, t=60004 jiffies, g=33204, c=33203, q=0) Mar 5 15:26:11 dracofiler kernel: Task dump for CPU 4: Mar 5 15:26:11 dracofiler kernel: kworker/4:1 R running task 0 2255 2 0x00000008 Mar 5 15:26:11 dracofiler kernel: 0000000000000010 0000000000010046 ffff88062fd6fe28 0000000000000018 Mar 5 15:26:11 dracofiler kernel: ffffffff810bb75f 0000000000000000 ffff88063fa165c0 ffff880631b09cb0 Mar 5 15:26:11 dracofiler kernel: 0000000000000008 ffff88063fa165e0 ffff880631b09c80 ffff88062fd6fec0 Mar 5 15:26:11 dracofiler kernel: Call Trace: Mar 5 15:26:11 dracofiler kernel: [<ffffffff810bb75f>] ? process_one_work+0x1df/0x430 Mar 5 15:26:11 dracofiler kernel: [<ffffffff810bb9fe>] ? worker_thread+0x4e/0x480 Mar 5 15:26:11 dracofiler kernel: [<ffffffff817a5cf7>] ? __schedule+0x3a7/0xa00 Mar 5 15:26:11 dracofiler kernel: [<ffffffff810bb9b0>] ? process_one_work+0x430/0x430 Mar 5 15:26:11 dracofiler kernel: [<ffffffff810bb9b0>] ? process_one_work+0x430/0x430 Mar 5 15:26:11 dracofiler kernel: [<ffffffff810c1388>] ? kthread+0xd8/0xf0 Mar 5 15:26:11 dracofiler kernel: [<ffffffff810c12b0>] ? kthread_worker_fn+0x170/0x170 Mar 5 15:26:11 dracofiler kernel: [<ffffffff817aa8cf>] ? ret_from_fork+0x3f/0x70 Mar 5 15:26:11 dracofiler kernel: [<ffffffff810c12b0>] ? kthread_worker_fn+0x170/0x170 Mar 5 15:28:25 dracofiler systemd: systemd-udevd.service: Watchdog timeout (limit 3min)! 
Mar 5 15:28:25 dracofiler audit: ANOM_ABEND auid=4294967295 uid=0 gid=0 ses=4294967295 subj=system_u:system_r:udev_t:s0-s0:c0.c1023 pid=1318 comm="systemd-udevd" exe="/usr/lib/systemd/systemd-udevd" sig=6
Mar 5 15:28:25 dracofiler abrt-hook-ccpp: Process 1318 (systemd-udevd) of user 0 killed by SIGABRT - dumping core
Mar 5 15:28:25 dracofiler audit: AVC avc: denied { getattr } for pid=2578 comm="abrt-hook-ccpp" path="ipc:[4026531839]" dev="nsfs" ino=4026531839 scontext=system_u:system_r:abrt_dump_oops_t:s0 tcontext=system_u:object_r:nsfs_t:s0 tclass=file permissive=1
Mar 5 15:28:29 dracofiler abrt-server: Deleting problem directory ccpp-2016-03-05-15:28:25-1318 (dup of ccpp-2016-01-23-14:34:49-798)
Mar 5 15:28:29 dracofiler abrt-server: No actions are found for event 'notify-dup'

Message from syslogd@dracofiler at Mar 5 15:29:11 ...
kernel:NMI watchdog: Watchdog detected hard LOCKUP on cpu 6

Mar 5 15:29:11 dracofiler kernel: NMI watchdog: Watchdog detected hard LOCKUP on cpu 6
Mar 5 15:29:11 dracofiler kernel: Modules linked in:
Mar 5 15:29:11 dracofiler kernel: tcm_qla2xxx target_core_user uio target_core_pscsi target_core_file target_core_iblock iscsi_target_mod target_core_mod coretemp kvm_intel kvm irqbypass snd_hda_codec_realtek crct10dif_pclmul snd_hda_codec_generic snd_hda_intel snd_hda_codec crc32_pclmul ses ghash_clmulni_intel snd_hda_core snd_hwdep gpio_ich enclosure iTCO_wdt iTCO_vendor_support joydev snd_seq scsi_transport_sas pcspkr snd_seq_device i5500_temp tpm_infineon snd_pcm shpchp snd_timer acpi_cpufreq snd ioatdma lpc_ich soundcore i2c_i801 i7core_edac edac_core tpm_tis tpm nfsd nfs_acl lockd grace auth_rpcgss sunrpc xfs libcrc32c ata_generic pata_acpi mgag200 drm_kms_helper qla2xxx ttm drm igb firewire_ohci crc32c_intel ptp firewire_core serio_raw megaraid_sas scsi_transport_fc pps_core crc_itu_t dca pata_jmicron fjes i2c_algo_bit
Mar 5 15:29:11 dracofiler kernel:
Mar 5 15:29:11 dracofiler kernel: CPU: 6 PID: 0 Comm: swapper/6 Tainted: G D W I 4.5.0-rc5+ #2
Mar 5 15:29:11 dracofiler kernel: Hardware name: iXsystems iX2224-E16R1200LPB/2.5/X8DAH, BIOS 2.1 12/30/2011
Mar 5 15:29:11 dracofiler kernel: 0000000000000086 7701bdc7df4f6012 ffff88063fa85b90 ffffffff813c1e0e
Mar 5 15:29:11 dracofiler kernel: 0000000000000000 0000000000000001 ffff88063fa85ba8 ffffffff81158a40
Mar 5 15:29:11 dracofiler kernel: ffff880631be0000 ffff88063fa85be0 ffffffff811a0cfc 0000000000000001
Mar 5 15:29:11 dracofiler kernel: Call Trace:
Mar 5 15:29:11 dracofiler kernel: <NMI> [<ffffffff813c1e0e>] dump_stack+0x63/0x85
Mar 5 15:29:11 dracofiler kernel: [<ffffffff81158a40>] watchdog_overflow_callback+0xe0/0xf0
Mar 5 15:29:11 dracofiler kernel: [<ffffffff811a0cfc>] __perf_event_overflow+0x8c/0x1d0
Mar 5 15:29:11 dracofiler kernel: [<ffffffff811a18d4>] perf_event_overflow+0x14/0x20
Mar 5 15:29:11 dracofiler kernel: [<ffffffff81035081>] intel_pmu_handle_irq+0x1e1/0x460
Mar 5 15:29:11 dracofiler kernel: [<ffffffff811f0ccd>] ? vunmap_page_range+0x20d/0x330
Mar 5 15:29:11 dracofiler kernel: [<ffffffff811f12d1>] ? unmap_kernel_range_noflush+0x11/0x20
Mar 5 15:29:11 dracofiler kernel: [<ffffffff81481c4f>] ? ghes_copy_tofrom_phys+0x10f/0x2a0
Mar 5 15:29:11 dracofiler kernel: [<ffffffff81481e78>] ? ghes_read_estatus+0x98/0x170
Mar 5 15:29:11 dracofiler kernel: [<ffffffff8102b628>] perf_event_nmi_handler+0x28/0x50
Mar 5 15:29:11 dracofiler kernel: [<ffffffff8101a059>] nmi_handle+0x69/0x120
Mar 5 15:29:11 dracofiler kernel: [<ffffffff8101a5f4>] default_do_nmi+0x44/0x100
Mar 5 15:29:11 dracofiler kernel: [<ffffffff8101a792>] do_nmi+0xe2/0x130
Mar 5 15:29:11 dracofiler kernel: [<ffffffff817acc71>] end_repeat_nmi+0x1a/0x1e
Mar 5 15:29:11 dracofiler kernel: [<ffffffff810eb9f7>] ? queued_spin_lock_slowpath+0x127/0x190
Mar 5 15:29:11 dracofiler kernel: [<ffffffff810eb9f7>] ? queued_spin_lock_slowpath+0x127/0x190
Mar 5 15:29:11 dracofiler kernel: [<ffffffff810eb9f7>] ? queued_spin_lock_slowpath+0x127/0x190
Mar 5 15:29:11 dracofiler kernel: <<EOE>> <IRQ> [<ffffffff817aa1f0>] _raw_spin_lock+0x20/0x30
Mar 5 15:29:11 dracofiler kernel: [<ffffffff810ba311>] __queue_work+0xb1/0x450
Mar 5 15:29:11 dracofiler kernel: [<ffffffff814cc89e>] ? add_timer_randomness+0xde/0x100
Mar 5 15:29:11 dracofiler kernel: [<ffffffff810bac87>] queue_work_on+0x27/0x40
Mar 5 15:29:11 dracofiler kernel: [<ffffffff814cc751>] credit_entropy_bits+0x2e1/0x350
Mar 5 15:29:11 dracofiler kernel: [<ffffffff814cbb00>] ? mix_pool_bytes+0x50/0xb0
Mar 5 15:29:11 dracofiler kernel: [<ffffffff814cc89e>] add_timer_randomness+0xde/0x100
Mar 5 15:29:11 dracofiler kernel: [<ffffffff814cc8f6>] add_disk_randomness+0x36/0xa0
Mar 5 15:29:11 dracofiler kernel: [<ffffffff81532cf8>] scsi_end_request+0x148/0x1d0
Mar 5 15:29:11 dracofiler kernel: [<ffffffff815351c4>] scsi_io_completion+0xc4/0x650
Mar 5 15:29:11 dracofiler kernel: [<ffffffff8152c0bf>] scsi_finish_command+0xcf/0x120
Mar 5 15:29:11 dracofiler kernel: [<ffffffff81534a65>] scsi_softirq_done+0x125/0x150
Mar 5 15:29:11 dracofiler kernel: [<ffffffff8139ae8c>] blk_done_softirq+0x8c/0xc0
Mar 5 15:29:11 dracofiler kernel: [<ffffffff810a7cab>] __do_softirq+0xfb/0x2d0
Mar 5 15:29:11 dracofiler kernel: [<ffffffff810a8082>] irq_exit+0x102/0x110
Mar 5 15:29:11 dracofiler kernel: [<ffffffff8104ed03>] smp_call_function_single_interrupt+0x33/0x40
Mar 5 15:29:11 dracofiler kernel: [<ffffffff817abadc>] call_function_single_interrupt+0x8c/0xa0
Mar 5 15:29:11 dracofiler kernel: <EOI> [<ffffffff8164471a>] ? cpuidle_enter_state+0x11a/0x2c0
Mar 5 15:29:11 dracofiler kernel: [<ffffffff816448f7>] cpuidle_enter+0x17/0x20
Mar 5 15:29:11 dracofiler kernel: [<ffffffff810e4f4a>] call_cpuidle+0x2a/0x40
Mar 5 15:29:11 dracofiler kernel: [<ffffffff810e5315>] cpu_startup_entry+0x295/0x350
Mar 5 15:29:11 dracofiler kernel: [<ffffffff8104f5ae>] start_secondary+0x15e/0x190
Mar 5 15:29:11 dracofiler kernel: INFO: rcu_sched detected stalls on CPUs/tasks:
Mar 5 15:29:11 dracofiler kernel: #0114-...: (1 GPs behind) idle=705/140000000000002/0 softirq=71174/71175 fqs=69218
Mar 5 15:29:11 dracofiler kernel: #011(detected by 7, t=240017 jiffies, g=33204, c=33203, q=0)
Mar 5 15:29:11 dracofiler kernel: Task dump for CPU 4:
Mar 5 15:29:11 dracofiler kernel: kworker/4:1 R running task 0 2255 2 0x00000008
Mar 5 15:29:11 dracofiler kernel: 0000000000000010 0000000000010046 ffff88062fd6fe28 0000000000000018
Mar 5 15:29:11 dracofiler kernel: ffffffff810bb75f 0000000000000000 ffff88063fa165c0 ffff880631b09cb0
Mar 5 15:29:11 dracofiler kernel: 0000000000000008 ffff88063fa165e0 ffff880631b09c80 ffff88062fd6fec0
Mar 5 15:29:11 dracofiler kernel: Call Trace:
Mar 5 15:29:11 dracofiler kernel: [<ffffffff810bb75f>] ? process_one_work+0x1df/0x430
Mar 5 15:29:11 dracofiler kernel: [<ffffffff810bb9fe>] ? worker_thread+0x4e/0x480
Mar 5 15:29:11 dracofiler kernel: [<ffffffff817a5cf7>] ? __schedule+0x3a7/0xa00
Mar 5 15:29:11 dracofiler kernel: [<ffffffff810bb9b0>] ? process_one_work+0x430/0x430
Mar 5 15:29:11 dracofiler kernel: [<ffffffff810bb9b0>] ? process_one_work+0x430/0x430
Mar 5 15:29:11 dracofiler kernel: [<ffffffff810c1388>] ? kthread+0xd8/0xf0
Mar 5 15:29:11 dracofiler kernel: [<ffffffff810c12b0>] ? kthread_worker_fn+0x170/0x170
Mar 5 15:29:11 dracofiler kernel: [<ffffffff817aa8cf>] ? ret_from_fork+0x3f/0x70
Mar 5 15:29:11 dracofiler kernel: [<ffffffff810c12b0>] ? kthread_worker_fn+0x170/0x170
Mar 5 15:29:55 dracofiler systemd: systemd-udevd.service: State 'stop-sigabrt' timed out. Terminating.
Mar 5 15:32:11 dracofiler kernel: INFO: rcu_sched detected stalls on CPUs/tasks:
Mar 5 15:32:11 dracofiler kernel: #0114-...: (1 GPs behind) idle=705/140000000000002/0 softirq=71174/71175 fqs=113526
Mar 5 15:32:11 dracofiler kernel: #011(detected by 3, t=420023 jiffies, g=33204, c=33203, q=0)
Mar 5 15:32:11 dracofiler kernel: Task dump for CPU 4:
Mar 5 15:32:11 dracofiler kernel: kworker/4:1 R running task 0 2255 2 0x00000008
Mar 5 15:32:11 dracofiler kernel: 0000000000000010 0000000000010046 ffff88062fd6fe28 0000000000000018
Mar 5 15:32:11 dracofiler kernel: ffffffff810bb75f 0000000000000000 ffff88063fa165c0 ffff880631b09cb0
Mar 5 15:32:11 dracofiler kernel: 0000000000000008 ffff88063fa165e0 ffff880631b09c80 ffff88062fd6fec0
Mar 5 15:32:11 dracofiler kernel: Call Trace:
Mar 5 15:32:11 dracofiler kernel: [<ffffffff810bb75f>] ? process_one_work+0x1df/0x430
Mar 5 15:32:11 dracofiler kernel: [<ffffffff810bb9fe>] ? worker_thread+0x4e/0x480
Mar 5 15:32:11 dracofiler kernel: [<ffffffff817a5cf7>] ? __schedule+0x3a7/0xa00
Mar 5 15:32:11 dracofiler kernel: [<ffffffff810bb9b0>] ? process_one_work+0x430/0x430
Mar 5 15:32:11 dracofiler kernel: [<ffffffff810bb9b0>] ? process_one_work+0x430/0x430
Mar 5 15:32:11 dracofiler kernel: [<ffffffff810c1388>] ? kthread+0xd8/0xf0
Mar 5 15:32:11 dracofiler kernel: [<ffffffff810c12b0>] ? kthread_worker_fn+0x170/0x170
Mar 5 15:32:11 dracofiler kernel: [<ffffffff817aa8cf>] ? ret_from_fork+0x3f/0x70
Mar 5 15:32:11 dracofiler kernel: [<ffffffff810c12b0>] ? kthread_worker_fn+0x170/0x170
Mar 5 15:35:11 dracofiler kernel: INFO: rcu_sched detected stalls on CPUs/tasks:
Mar 5 15:35:11 dracofiler kernel: #0114-...: (1 GPs behind) idle=705/140000000000002/0 softirq=71174/71175 fqs=149291
Mar 5 15:35:11 dracofiler kernel: #011(detected by 13, t=600187 jiffies, g=33204, c=33203, q=0)
Mar 5 15:35:11 dracofiler kernel: Task dump for CPU 4:
Mar 5 15:35:11 dracofiler kernel: kworker/4:1 R running task 0 2255 2 0x00000008
Mar 5 15:35:11 dracofiler kernel: 0000000000000010 0000000000010046 ffff88062fd6fe28 0000000000000018
Mar 5 15:35:11 dracofiler kernel: ffffffff810bb75f 0000000000000000 ffff88063fa165c0 ffff880631b09cb0
Mar 5 15:35:11 dracofiler kernel: 0000000000000008 ffff88063fa165e0 ffff880631b09c80 ffff88062fd6fec0
Mar 5 15:35:11 dracofiler kernel: Call Trace:
Mar 5 15:35:11 dracofiler kernel: [<ffffffff810bb75f>] ? process_one_work+0x1df/0x430
Mar 5 15:35:11 dracofiler kernel: [<ffffffff810bb9fe>] ? worker_thread+0x4e/0x480
Mar 5 15:35:11 dracofiler kernel: [<ffffffff817a5cf7>] ? __schedule+0x3a7/0xa00
Mar 5 15:35:11 dracofiler kernel: [<ffffffff810bb9b0>] ? process_one_work+0x430/0x430
Mar 5 15:35:11 dracofiler kernel: [<ffffffff810bb9b0>] ? process_one_work+0x430/0x430
Mar 5 15:35:11 dracofiler kernel: [<ffffffff810c1388>] ? kthread+0xd8/0xf0
Mar 5 15:35:11 dracofiler kernel: [<ffffffff810c12b0>] ? kthread_worker_fn+0x170/0x170
Mar 5 15:35:11 dracofiler kernel: [<ffffffff817aa8cf>] ? ret_from_fork+0x3f/0x70
Mar 5 15:35:11 dracofiler kernel: [<ffffffff810c12b0>] ? kthread_worker_fn+0x170/0x170
Mar 5 15:38:11 dracofiler kernel: INFO: rcu_sched detected stalls on CPUs/tasks:
Mar 5 15:38:11 dracofiler kernel: #0114-...: (1 GPs behind) idle=705/140000000000002/0 softirq=71174/71175 fqs=188508
Mar 5 15:38:11 dracofiler kernel: #011(detected by 3, t=780193 jiffies, g=33204, c=33203, q=0)
Mar 5 15:38:11 dracofiler kernel: Task dump for CPU 4:
Mar 5 15:38:11 dracofiler kernel: kworker/4:1 R running task 0 2255 2 0x00000008
Mar 5 15:38:11 dracofiler kernel: 0000000000000010 0000000000010046 ffff88062fd6fe28 0000000000000018
Mar 5 15:38:11 dracofiler kernel: ffffffff810bb75f 0000000000000000 ffff88063fa165c0 ffff880631b09cb0
Mar 5 15:38:11 dracofiler kernel: 0000000000000008 ffff88063fa165e0 ffff880631b09c80 ffff88062fd6fec0
Mar 5 15:38:11 dracofiler kernel: Call Trace:
Mar 5 15:38:11 dracofiler kernel: [<ffffffff810bb75f>] ? process_one_work+0x1df/0x430
Mar 5 15:38:11 dracofiler kernel: [<ffffffff810bb9fe>] ? worker_thread+0x4e/0x480
Mar 5 15:38:11 dracofiler kernel: [<ffffffff817a5cf7>] ? __schedule+0x3a7/0xa00
Mar 5 15:38:11 dracofiler kernel: [<ffffffff810bb9b0>] ? process_one_work+0x430/0x430
Mar 5 15:38:11 dracofiler kernel: [<ffffffff810bb9b0>] ? process_one_work+0x430/0x430
Mar 5 15:38:11 dracofiler kernel: [<ffffffff810c1388>] ? kthread+0xd8/0xf0
Mar 5 15:38:11 dracofiler kernel: [<ffffffff810c12b0>] ? kthread_worker_fn+0x170/0x170
Mar 5 15:38:11 dracofiler kernel: [<ffffffff817aa8cf>] ? ret_from_fork+0x3f/0x70
Mar 5 15:38:11 dracofiler kernel: [<ffffffff810c12b0>] ? kthread_worker_fn+0x170/0x170