Interesting 'list_add double add' with nvme drives

Hi,

I'm seeing a 'list_add double add' BUG (lib/list_debug.c:31) on 5.0.0
when we hot-remove and then re-insert several nvme drives. It's not easy
to reproduce, and the one surviving log is pasted below.

Alex


[  111.808900] pciehp 0000:b0:04.0:pcie204: Slot(178): Link Down
[  117.496424] pciehp 0000:b0:04.0:pcie204: Slot(178): Link Up
[  117.508144] pciehp 0000:3c:06.0:pcie204: Slot(180): Link Up
[  117.521525] pciehp 0000:b0:05.0:pcie204: Slot(179): Link Up
[  117.764856] pci 0000:3f:00.0: [144d:a822] type 00 class 0x010802
[  117.764897] pci 0000:3f:00.0: reg 0x10: [mem 0x00000000-0x00003fff 64bit]
[  117.764948] pci 0000:3f:00.0: Max Payload Size set to 256 (was 128,
max 256)
[  117.765671] pcieport 0000:3c:06.0: bridge window [io  0x1000-0x0fff]
to [bus 3f] add_size 1000
[  117.765679] pcieport 0000:3c:06.0: BAR 13: no space for [io  size 0x1000]
[  117.765682] pcieport 0000:3c:06.0: BAR 13: failed to assign [io  size
0x1000]
[  117.765686] pcieport 0000:3c:06.0: BAR 13: no space for [io  size 0x1000]
[  117.765689] pcieport 0000:3c:06.0: BAR 13: failed to assign [io  size
0x1000]
[  117.765696] pci 0000:3f:00.0: BAR 0: assigned [mem
0xab500000-0xab503fff 64bit]
[  117.765710] pcieport 0000:3c:06.0: PCI bridge to [bus 3f]
[  117.765717] pcieport 0000:3c:06.0:   bridge window [mem
0xab500000-0xab5fffff]
[  117.765723] pcieport 0000:3c:06.0:   bridge window [mem
0x382000400000-0x3820005fffff 64bit pref]
[  117.766944] nvme nvme2: pci function 0000:3f:00.0
[  117.767060] nvme 0000:3f:00.0: enabling device (0000 -> 0002)
[  117.780851] pci 0000:b2:00.0: [144d:a822] type 00 class 0x010802
[  117.780889] pci 0000:b2:00.0: reg 0x10: [mem 0x00000000-0x00003fff 64bit]
[  117.780938] pci 0000:b2:00.0: Max Payload Size set to 256 (was 128,
max 256)
[  117.781576] pcieport 0000:b0:05.0: bridge window [io  0x1000-0x0fff]
to [bus b2] add_size 1000
[  117.781583] pcieport 0000:b0:05.0: BAR 13: no space for [io  size 0x1000]
[  117.781586] pcieport 0000:b0:05.0: BAR 13: failed to assign [io  size
0x1000]
[  117.781590] pcieport 0000:b0:05.0: BAR 13: no space for [io  size 0x1000]
[  117.781593] pcieport 0000:b0:05.0: BAR 13: failed to assign [io  size
0x1000]
[  117.781600] pci 0000:b2:00.0: BAR 0: assigned [mem
0xe1400000-0xe1403fff 64bit]
[  117.781613] pcieport 0000:b0:05.0: PCI bridge to [bus b2]
[  117.781620] pcieport 0000:b0:05.0:   bridge window [mem
0xe1400000-0xe14fffff]
[  117.781626] pcieport 0000:b0:05.0:   bridge window [mem
0x386000200000-0x3860003fffff 64bit pref]
[  117.782498] nvme nvme3: pci function 0000:b2:00.0
[  117.782530] nvme 0000:b2:00.0: enabling device (0000 -> 0002)
[  117.800846] pci 0000:b1:00.0: [8086:0a55] type 00 class 0x010802
[  117.800883] pci 0000:b1:00.0: reg 0x10: [mem 0x00000000-0x00003fff 64bit]
[  117.800927] pci 0000:b1:00.0: Max Payload Size set to 256 (was 128,
max 512)
[  117.800932] pci 0000:b1:00.0: enabling Extended Tags
[  117.801564] pcieport 0000:b0:04.0: bridge window [io  0x1000-0x0fff]
to [bus b1] add_size 1000
[  117.801571] pcieport 0000:b0:04.0: BAR 13: no space for [io  size 0x1000]
[  117.801574] pcieport 0000:b0:04.0: BAR 13: failed to assign [io  size
0x1000]
[  117.801577] pcieport 0000:b0:04.0: BAR 13: no space for [io  size 0x1000]
[  117.801580] pcieport 0000:b0:04.0: BAR 13: failed to assign [io  size
0x1000]
[  117.801587] pci 0000:b1:00.0: BAR 0: assigned [mem
0xe1500000-0xe1503fff 64bit]
[  117.801599] pcieport 0000:b0:04.0: PCI bridge to [bus b1]
[  117.801606] pcieport 0000:b0:04.0:   bridge window [mem
0xe1500000-0xe15fffff]
[  117.801612] pcieport 0000:b0:04.0:   bridge window [mem
0x386000000000-0x3860001fffff 64bit pref]
[  117.802362] nvme nvme4: pci function 0000:b1:00.0
[  117.802390] nvme 0000:b1:00.0: enabling device (0000 -> 0002)
[  117.896666] pciehp 0000:b0:04.0:pcie204: Slot(178): Card not present
[  117.896844] pciehp 0000:b0:05.0:pcie204: Slot(179): Card not present
[  117.896944] pciehp 0000:3c:06.0:pcie204: Slot(180): Card not present
[  120.225239] nvme nvme2: Shutdown timeout set to 10 seconds
[  120.225299] nvme nvme3: Shutdown timeout set to 10 seconds
[  121.336917] nvme nvme4: failed to mark controller CONNECTING
[  121.336922] nvme nvme4: Removing after probe failure status: 0
[  121.353534] pciehp 0000:b0:04.0:pcie204: Slot(178): Card present
[  121.353538] pciehp 0000:b0:04.0:pcie204: Slot(178): Link Up
[  121.368290] list_add double add: new=ffff956b64c0c658,
prev=ffff956b64c0c658, next=ffff956f6f2ddfe0.
[  121.368310] ------------[ cut here ]------------
[  121.368312] kernel BUG at lib/list_debug.c:31!
[  121.372769] invalid opcode: 0000 [#1] SMP PTI
[  121.377132] CPU: 7 PID: 628 Comm: irq/45-pciehp Not tainted 5.0.0 #216
[  121.383662] Hardware name: Dell Inc. PowerEdge R740xd/07X9K0, BIOS
1.4.4 [Recoverable-Unmask] 03/09/2018
[  121.393137] RIP: 0010:__list_add_valid+0x41/0x50
[  121.397751] Code: 85 94 00 00 00 48 39 c7 74 0b 48 39 d7 74 06 b8 01
00 00 00 c3 48 89 f2 4c 89 c1 48 89 fe 48 c7 c7 50 25 12 af e8 1d 2e c9
ff <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 8b 07 48 8b 57 08
[  121.416495] RSP: 0018:ffffbe9708f9bbf0 EFLAGS: 00010046
[  121.421723] RAX: 0000000000000058 RBX: ffff956f6f2ddfe0 RCX:
0000000000000000
[  121.428854] RDX: 0000000000000000 RSI: ffff956f6f2d6908 RDI:
ffff956f6f2d6908
[  121.435986] RBP: ffff956b64c0c600 R08: 000000000000087c R09:
0000000000000003
[  121.443118] R10: 0000000000000000 R11: 0000000000000001 R12:
ffff956b64c0c658
[  121.450250] R13: 0000000000000282 R14: ffff956b64c0c658 R15:
0000000000000000
[  121.457383] FS:  0000000000000000(0000) GS:ffff956f6f2c0000(0000)
knlGS:0000000000000000
[  121.465468] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  121.471212] CR2: 00007f77870ba068 CR3: 000000083389e004 CR4:
00000000007606e0
[  121.478345] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[  121.485477] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[  121.492610] PKRU: 55555554
[  121.495320] Call Trace:
[  121.497780]  __blk_complete_request+0x74/0x110
[  121.502222]  blk_mq_complete_request+0xb6/0x100
[  121.506759]  nvme_cancel_request+0x27/0x70 [nvme_core]
[  121.511896]  blk_mq_tagset_busy_iter+0x203/0x270
[  121.516510]  ? nvme_complete_rq+0x210/0x210 [nvme_core]
[  121.521736]  ? nvme_complete_rq+0x210/0x210 [nvme_core]
[  121.526964]  nvme_dev_disable+0xfb/0x1d0 [nvme]
[  121.531493]  nvme_remove+0x12c/0x170 [nvme]
[  121.535681]  pci_device_remove+0x3b/0xc0
[  121.539608]  device_release_driver_internal+0x183/0x240
[  121.544834]  pci_stop_bus_device+0x69/0x90
[  121.548931]  pci_stop_and_remove_bus_device+0xe/0x20
[  121.553899]  pciehp_unconfigure_device+0x84/0x140
[  121.558608]  pciehp_disable_slot+0x67/0x110
[  121.562796]  pciehp_handle_presence_or_link_change+0x25f/0x400
[  121.568630]  ? __synchronize_hardirq+0x43/0x50
[  121.573074]  pciehp_ist+0x1bb/0x1c0
[  121.576567]  ? irq_finalize_oneshot.part.43+0xe0/0xe0
[  121.581617]  irq_thread_fn+0x1f/0x60
[  121.585198]  irq_thread+0xe7/0x170
[  121.588602]  ? irq_forced_thread_fn+0x70/0x70
[  121.592963]  ? irq_thread_check_affinity+0x90/0x90
[  121.597754]  kthread+0x112/0x130
[  121.600987]  ? kthread_create_on_node+0x60/0x60
[  121.605521]  ret_from_fork+0x35/0x40
[  121.609098] Modules linked in: xt_CHECKSUM ipt_MASQUERADE tun bridge
stp llc ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack
ebtable_nat ip6table_nat nf_nat_ipv6 ip6table_mangle ip6table_raw
ip6table_security iptable_nat nf_nat_ipv4 nf_nat devlink iptable_mangle
iptable_raw iptable_security nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4
ip_set nfnetlink ebtable_filter ebtables ip6table_filter ip6_tables
sunrpc f2fs vfat fat intel_rapl skx_edac nfit x86_pkg_temp_thermal
intel_powerclamp coretemp kvm_intel kvm ses enclosure irqbypass
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate joydev
iTCO_wdt iTCO_vendor_support ipmi_ssif dcdbas intel_uncore
intel_rapl_perf mei_me pcspkr i2c_i801 mei ioatdma lpc_ich ipmi_si
ipmi_devintf ipmi_msghandler acpi_power_meter raid1 dm_raid raid456
libcrc32c async_raid6_recov async_memcpy async_pq async_xor xor async_tx
raid6_pq mgag200 drm_kms_helper ttm drm mpt3sas igb nvme crc32c_intel
raid_class nvme_core uas scsi_transport_sas usb_storage
[  121.609134]  dca i2c_algo_bit
[  121.699398] ---[ end trace 8704317f268b2403 ]---
[  121.743228] RIP: 0010:__list_add_valid+0x41/0x50
[  121.747858] Code: 85 94 00 00 00 48 39 c7 74 0b 48 39 d7 74 06 b8 01
00 00 00 c3 48 89 f2 4c 89 c1 48 89 fe 48 c7 c7 50 25 12 af e8 1d 2e c9
ff <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 8b 07 48 8b 57 08
[  121.766601] RSP: 0018:ffffbe9708f9bbf0 EFLAGS: 00010046
[  121.771827] RAX: 0000000000000058 RBX: ffff956f6f2ddfe0 RCX:
0000000000000000
[  121.778958] RDX: 0000000000000000 RSI: ffff956f6f2d6908 RDI:
ffff956f6f2d6908
[  121.786089] RBP: ffff956b64c0c600 R08: 000000000000087c R09:
0000000000000003
[  121.793222] R10: 0000000000000000 R11: 0000000000000001 R12:
ffff956b64c0c658
[  121.800353] R13: 0000000000000282 R14: ffff956b64c0c658 R15:
0000000000000000
[  121.807487] FS:  0000000000000000(0000) GS:ffff956f6f2c0000(0000)
knlGS:0000000000000000
[  121.815573] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  121.821319] CR2: 00007f77870ba068 CR3: 000000083389e004 CR4:
00000000007606e0
[  121.828448] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[  121.835583] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[  121.842713] PKRU: 55555554
[  121.845506] nvme nvme3: IO queues not created
[  121.849891] nvme nvme3: failed to mark controller state 2
[  121.855298] nvme nvme3: Removing after probe failure status: 0
[  124.879721] md/raid1:md126: active with 1 out of 2 mirrors
[  124.885245] md126: failed to create bitmap (-5)



