Re: 4.11.0-rc5-00011-g08e4e0d oops in mpt3sas driver

On 06/04/17 08:30, Brad Campbell wrote:
> G'day All,
>
> This is a vaguely current git head kernel compiled yesterday.
>
> Oopsed and rebooted itself, and then oopsed and rebooted again. There
> was no sign of a raid rebuild in the kernel logs, and it's a staging
> machine so there is nothing running after a reboot that goes near these
> disks. They should have been completely idle the second time around.
>
> This box suffered from bad rcu stalls on 4.10.x stable kernels, so I
> upgraded to git head. It's all new hardware (the CPU, Chipset and
> board), so I expected some issues with the board, but the LSI cards have
> been around for a while now.

Further investigation indicates it might be a deeper problem. This is the first oops captured, and nothing in its call trace touches the mpt3sas driver.

[49580.533852] BUG: unable to handle kernel paging request at ffffffff817cddfe
[49580.533875] IP: queued_spin_lock_slowpath+0xe7/0x170
[49580.533879] PGD 180a067
[49580.533879] PUD 180b063
[49580.533882] PMD 80000000016001e1
[49580.533885]
[49580.533890] Oops: 0003 [#1] SMP
[49580.533894] Modules linked in: it87(O) deflate zlib_deflate ctr des_generic cbc cmac sha1_generic md5 hmac af_key xfrm_algo nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace sunrpc bonding sha256_generic dm_crypt aesni_intel aes_x86_64 crypto_simd cryptd glue_helper hwmon_vid netconsole configfs vhost_net vhost kvm_amd kvm irqbypass usbhid usb_storage nouveau video drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea ttm drm mxm_wmi xhci_pci i2c_piix4 xhci_hcd usbcore usb_common wmi acpi_cpufreq mpt3sas igb i2c_algo_bit raid_class scsi_transport_sas ahci libahci
[49580.533929] CPU: 6 PID: 114 Comm: kswapd0 Tainted: G O 4.11.0-rc5-00011-g08e4e0d-dirty #39
[49580.533933] Hardware name: System manufacturer System Product Name/PRIME X370-PRO, BIOS 0515 03/30/2017
[49580.534045] task: ffff8807f9ad0000 task.stack: ffffc90000430000
[49580.534049] RIP: 0010:queued_spin_lock_slowpath+0xe7/0x170
[49580.534052] RSP: 0018:ffffc90000433a50 EFLAGS: 00010082
[49580.534056] RAX: 00000000000034e1 RBX: 0000000000000292 RCX: 00000000001c0000
[49580.534059] RDX: ffffffff817cddfe RSI: ffff88081ed99900 RDI: ffff8806ddb860e0
[49580.534063] RBP: ffff8806ddb860e0 R08: 0000000000000101 R09: dead000000000200
[49580.534119] R10: ffffea001c000700 R11: ffff880006b457b9 R12: ffff8806ddb860c8
[49580.534122] R13: 0000000000000001 R14: ffffc90000433b40 R15: ffff8806ddb860c8
[49580.534179] FS: 0000000000000000(0000) GS:ffff88081ed80000(0000) knlGS:0000000000000000
[49580.534183] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[49580.534186] CR2: ffffffff817cddfe CR3: 0000000001809000 CR4: 00000000003406e0
[49580.534190] Call Trace:
[49580.534247]  ? _raw_spin_lock_irqsave+0x1f/0x30
[49580.534253]  ? __remove_mapping+0x65/0x1b0
[49580.534258]  ? page_mkclean_one+0x100/0x100
[49580.534313]  ? page_get_anon_vma+0xa0/0xa0
[49580.534317]  ? shrink_page_list+0x6aa/0xda0
[49580.534321]  ? shrink_inactive_list+0x1f6/0x4b0
[49580.534325]  ? es_reclaim_extents+0x55/0xe0
[49580.534328]  ? inactive_list_is_low.isra.70+0x10e/0x1c0
[49580.534332]  ? shrink_node_memcg.isra.75+0x58c/0x6b0
[49580.534531]  ? shrink_node+0x4a/0x190
[49580.534705]  ? kswapd+0x2b7/0x5d0
[49580.535076]  ? kthread+0xf1/0x130
[49580.535477]  ? shrink_node+0x190/0x190
[49580.535869]  ? __kthread_init_worker+0xa0/0xa0
[49580.536257]  ? ret_from_fork+0x23/0x30
[49580.536666] Code: 47 02 c1 e0 10 0f 84 93 00 00 00 48 89 c2 c1 e8 12 48 c1 ea 0c ff c8 83 e2 30 48 98 48 81 c2 00 99 01 00 48 03 14 c5 20 54 77 81 <48> 89 32 8b 46 08 85 c0 75 09 f3 90 8b 46 08 85 c0 74 f7 4c 8b
[49580.537489] RIP: queued_spin_lock_slowpath+0xe7/0x170 RSP: ffffc90000433a50
[49580.537904] CR2: ffffffff817cddfe
[49580.540107] ---[ end trace f58d3bdd0830f2bf ]---
[49580.540642] Kernel panic - not syncing: Fatal exception
[49580.541212] Kernel Offset: disabled
[49580.541493] Rebooting in 10 seconds..
[49590.501026] ACPI MEMORY or I/O RESET_REG.
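
For anyone reading the trace, here is a minimal sketch of how the 0003 in the "Oops:" line decodes, assuming the standard x86 page-fault error code bit layout. The function name and the table are illustrative only and are not taken from any kernel tool.

```python
# Decode the hex error code printed in an x86 "Oops: NNNN" line.
# Assumption: standard x86 page-fault error code bits (P, W/R, U/S, RSVD, I/D).
# Each entry: (mask, meaning when set, meaning when clear or None).
PF_BITS = [
    (0x01, "protection violation (page was present)", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", None),
]


def decode_oops_code(text):
    """Return a list of human-readable meanings for a hex error code string."""
    code = int(text, 16)
    meanings = []
    for mask, if_set, if_clear in PF_BITS:
        if code & mask:
            meanings.append(if_set)
        elif if_clear is not None:
            meanings.append(if_clear)
    return meanings


if __name__ == "__main__":
    for line in decode_oops_code("0003"):
        print(line)
```

For 0003 this yields a kernel-mode write to a page that is present, i.e. a protection fault rather than a missing mapping. That matches CR2 and RDX both holding ffffffff817cddfe at the faulting store.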


This box survives days of memtest, but I'm not above suspecting the underlying hardware if it points to that.