Re: megaraid_sas problem for scsi_add_host() fail

Hi Sumit,

This is a megaraid_sas driver bug. The driver does not free up its resources
properly when scsi_add_host() fails. Please try the attached patch.
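
For anyone following along, the kind of cleanup being discussed can be sketched as a reverse-order unwind on the probe error path. This is only an illustration and not the attached patch: every demo_* name, the struct, and the allocation sizes are made up for the example, and the real driver has far more state to release.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for per-adapter state; not the real driver struct. */
struct demo_instance {
    void *reply_map;    /* first allocation made during probe */
    void *ctx;          /* second allocation made during probe */
    bool  irq_enabled;  /* stands in for "interrupts/polling armed" */
};

/* Counts live allocations so a caller can check that nothing leaked. */
static int demo_live_allocs;

static void *demo_alloc(size_t n)
{
    void *p = calloc(1, n);

    if (p)
        demo_live_allocs++;
    return p;
}

static void demo_free(void *p)
{
    if (!p)
        return;
    assert(demo_live_allocs > 0);
    free(p);
    demo_live_allocs--;
}

/* Stand-in for the host registration step; fails on request. */
static int demo_add_host(bool fail)
{
    return fail ? -1 : 0;
}

/* Releases everything a successful (or partially completed) probe set up. */
static void demo_remove(struct demo_instance *inst)
{
    inst->irq_enabled = false;      /* quiesce before freeing memory */
    demo_free(inst->ctx);
    inst->ctx = NULL;
    demo_free(inst->reply_map);
    inst->reply_map = NULL;
}

/* Probe with goto-based unwinding: each label undoes the steps before it. */
static int demo_probe(struct demo_instance *inst, bool fail_add_host)
{
    int rc = -1;

    inst->reply_map = demo_alloc(64);
    if (!inst->reply_map)
        goto out;

    inst->ctx = demo_alloc(128);
    if (!inst->ctx)
        goto free_reply_map;

    inst->irq_enabled = true;

    rc = demo_add_host(fail_add_host);
    if (rc)
        goto disable_irq;

    return 0;

disable_irq:
    inst->irq_enabled = false;
    demo_free(inst->ctx);
    inst->ctx = NULL;
free_reply_map:
    demo_free(inst->reply_map);
    inst->reply_map = NULL;
out:
    return rc;
}
```

Calling demo_probe() with fail_add_host set to true should leave demo_live_allocs at zero and irq_enabled false, which is the property the error path needs. Anything acquired before the failing step but missing from the unwind labels is what a tool like kmemleak would later report.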

Yeah, that looks like it works. The driver failed to bind gracefully.

However, we may still have a lot of memory leaks:


root@(none)$ echo scan > /sys/kernel/debug/kmemleak
root@(none)$ [ 140.585484] kmemleak: 259 new suspected memory leaks (see /sys/kernel/debug/kmemleak)

root@(none)$
root@(none)$ more /sys/kernel/debug/kmemleak
unreferenced object 0xffff0026b9184c00 (size 512):
  comm "kworker/0:0", pid 5, jiffies 4294903201 (age 95.768s)
  hex dump (first 32 bytes):
    60 00 00 00 61 00 00 00 62 00 00 00 63 00 00 00  `...a...b...c...
    64 00 00 00 65 00 00 00 66 00 00 00 67 00 00 00  d...e...f...g...
  backtrace:
    [<(____ptrval____)>] slab_post_alloc_hook+0x6c/0xa0
    [<(____ptrval____)>] __kmalloc+0x174/0x280
    [<(____ptrval____)>] megasas_probe_one+0x798/0x2878
    [<(____ptrval____)>] local_pci_probe+0x74/0xf0
    [<(____ptrval____)>] work_for_cpu_fn+0x2c/0x48
    [<(____ptrval____)>] process_one_work+0x488/0xc08
    [<(____ptrval____)>] worker_thread+0x330/0x5d0
    [<(____ptrval____)>] kthread+0x1c8/0x1d0
    [<(____ptrval____)>] ret_from_fork+0x10/0x18
unreferenced object 0xffff0026b922c000 (size 4096):
  comm "kworker/0:0", pid 5, jiffies 4294903201 (age 95.768s)
  hex dump (first 32 bytes):
    00 00 21 b7 26 00 ff ff 00 00 9f ff 00 00 00 00  ..!.&...........
    00 10 22 10 00 a0 ff ff 00 00 00 00 00 00 00 00  ..".............
  backtrace:
    [<(____ptrval____)>] slab_post_alloc_hook+0x6c/0xa0
    [<(____ptrval____)>] kmem_cache_alloc_trace+0x140/0x228
    [<(____ptrval____)>] megasas_alloc_fusion_context+0x30/0x1b0
    [<(____ptrval____)>] megasas_probe_one+0x7d8/0x2878
    [<(____ptrval____)>] local_pci_probe+0x74/0xf0
    [<(____ptrval____)>] work_for_cpu_fn+0x2c/0x48
    [<(____ptrval____)>] process_one_work+0x488/0xc08
    [<(____ptrval____)>] worker_thread+0x330/0x5d0
    [<(____ptrval____)>] kthread+0x1c8/0x1d0
    [<(____ptrval____)>] ret_from_fork+0x10/0x18
unreferenced object 0xffff0026b7013000 (size 2048):
  comm "kworker/0:0", pid 5, jiffies 4294903512 (age 94.540s)
  hex dump (first 32 bytes):
    00 58 18 b9 26 00 ff ff 00 5c 18 b9 26 00 ff ff  .X..&....\..&...
    00 60 18 b9 26 00 ff ff 00 64 18 b9 26 00 ff ff  .`..&....d..&...
  backtrace:
    [<(____ptrval____)>] slab_post_alloc_hook+0x6c/0xa0
    [<(____ptrval____)>] kmem_cache_alloc_trace+0x140/0x228
root@(none)$


Thanks,
John


Thanks,
Sumit

[   62.516871] megasas: 07.713.01.00-rc1
[   62.526189] megaraid_sas 0000:08:00.0: Adding to iommu group 1
[   62.571790] megaraid_sas 0000:08:00.0: BAR:0x0  BAR's
base_addr(phys):0x0000080010000000  mapped virt_addr:0x(____ptrval____)
[   62.571802] megaraid_sas 0000:08:00.0: FW now in Ready state
[   62.583811] megaraid_sas 0000:08:00.0: 63 bit DMA mask and 63 bit
consistent mask
[   62.602143] megaraid_sas 0000:08:00.0: firmware supports msix : (128)
[   62.780250] megaraid_sas 0000:08:00.0: requested/available msix 128/128
[   62.794292] megaraid_sas 0000:08:00.0: current msix/online cpus :
(128/128)
[   62.809011] megaraid_sas 0000:08:00.0: RDPQ mode : (enabled)
[   62.820968] megaraid_sas 0000:08:00.0: Current firmware supports
maximum commands: 4077 LDIO threshold: 0
[   62.937043] megaraid_sas 0000:08:00.0: Configured max firmware
commands: 4076
[   63.509185] megaraid_sas 0000:08:00.0: Performance mode :Latency
[   63.521906] megaraid_sas 0000:08:00.0: FW supports sync cache : Yes
[   63.535148] megaraid_sas 0000:08:00.0: megasas_disable_intr_fusion is
called outbound_intr_mask:0x40000009
[   63.610607] megaraid_sas 0000:08:00.0: FW provided supportMaxExtLDs:
1 max_lds: 64
[   63.626618] megaraid_sas 0000:08:00.0: controller type : MR(2048MB)
[   63.639870] megaraid_sas 0000:08:00.0: Online Controller Reset(OCR) :
Enabled
[   63.654945] megaraid_sas 0000:08:00.0: Secure JBOD support : Yes
[   63.667661] megaraid_sas 0000:08:00.0: NVMe passthru support : Yes
[   63.667672] megaraid_sas 0000:08:00.0: FW provided TM TaskAbort/Reset
timeout : 6 secs/60 secs
[   63.698922] megaraid_sas 0000:08:00.0: JBOD sequence map support : Yes
[   63.712715] megaraid_sas 0000:08:00.0: PCI Lane Margining support : No
[   63.754764] megaraid_sas 0000:08:00.0: NVME page size : (4096)
[   63.787258] megaraid_sas 0000:08:00.0: megasas_enable_intr_fusion is
called outbound_intr_mask:0x40000000
[   63.807485] megaraid_sas 0000:08:00.0: INIT adapter done
[   63.822235] megaraid_sas 0000:08:00.0: pci id :
(0x1000)/(0x0016)/(0x19e5)/(0xd215)
[   63.838652] megaraid_sas 0000:08:00.0: unevenspan support : no
[   63.850980] megaraid_sas 0000:08:00.0: firmware crash dump : no
[   63.863499] megaraid_sas 0000:08:00.0: JBOD sequence map : enabled
[   63.877352] scsi host0: Avago SAS based MegaRAID driver
[   63.890398] megaraid_sas 0000:08:00.0: Failed to add host from
megasas_io_attach 6802
[   63.906999] megaraid_sas 0000:08:00.0: megasas_disable_intr_fusion is
called outbound_intr_mask:0x40000009
[   64.591755] nvme 0000:81:00.0: Adding to iommu group 2
[   64.636476] nvme nvme0: pci function 0000:81:00.0
[   64.669635] libphy: Fixed MDIO Bus: probed
[   64.680255] tun: Universal TUN/TAP device driver, 1.6
[   64.694422] thunder_xcv, ver 1.0
[   64.702042] thunder_bgx, ver 1.0
[   64.709277] nicpf, ver 1.0
[   64.718144] e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
[   64.730402] e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
[   64.743337] igb: Intel(R) Gigabit Ethernet Network Driver - version
5.6.0-k
[   64.754981] nvme nvme0: Removing after probe failure status: -12
[   64.757953] igb: Copyright (c) 2007-2014 Intel Corporation.
[   64.782805] igbvf: Intel(R) Gigabit Virtual Function Network Driver -
version 2.4.0-k
[   64.799423] igbvf: Copyright (c) 2009 - 2012 Intel Corporation.
[   64.813848] sky2: driver version 1.30
[   64.825564] VFIO - User Level meta-driver version: 0.3
[   64.848089] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[   64.862029] ehci-pci: EHCI PCI platform driver
[   64.873445] ehci-pci 0000:7a:01.0: Adding to iommu group 3
[   64.886700]
==================================================================
[   64.901999] BUG: KASAN: slab-out-of-bounds in
run_timer_softirq+0x6f4/0xae0
[   64.916663] Write of size 8 at addr ffff0026b931aae0 by task swapper/0/0

[   64.933914] CPU: 0 PID: 0 Comm: swapper/0 Not tainted
5.6.0-rc3-00005-g17ceebe3a05c-dirty #1775
[   64.952240] Hardware name: Huawei TaiShan 200 (Model 2280)/BC82AMDD,
BIOS 2280-V2 CS V3.B160.01 02/24/2020
[   64.972575] Call trace:
[   64.977729]  dump_backtrace+0x0/0x298
[   64.985439]  show_stack+0x14/0x20
[   64.992418]  dump_stack+0x118/0x190
[   64.999762]  print_address_description.isra.9+0x6c/0x3b8
[   65.010953]  __kasan_report+0x134/0x23c
[   65.019029]  kasan_report+0xc/0x18
[   65.026188]  __asan_store8+0x94/0xb8
[   65.033720]  run_timer_softirq+0x6f4/0xae0
[   65.042343]  efi_header_end+0x16c/0x840
[   65.050420]  irq_exit+0x19c/0x1a8
[   65.057396]  __handle_domain_irq+0x7c/0xe0
[   65.066022]  gic_handle_irq+0x64/0x168
[   65.073917]  el1_irq+0xbc/0x180
[   65.080528]  arch_cpu_idle+0x3c/0x320
[   65.088239]  default_idle_call+0x28/0x4c
[   65.096502]  do_idle+0x278/0x348
[   65.103295]  cpu_startup_entry+0x24/0x40
[   65.111554]  rest_init+0x1c4/0x298
[   65.118718]  arch_call_rest_init+0xc/0x14
[   65.127159]  start_kernel+0x848/0x888

[   65.138006] Allocated by task 0:
[   65.144802] (stack is not available)

[   65.155465] Freed by task 0:
[   65.161530] (stack is not available)

[   65.172193] The buggy address belongs to the object at ffff0026b931aa00
   which belongs to the cache pool_workqueue of size 256
[   65.199113] The buggy address is located 224 bytes inside of
   256-byte region [ffff0026b931aa00, ffff0026b931ab00)
[   65.223840] The buggy address belongs to the page:
[   65.233931] page:fffffe009ac4c600 refcount:1 mapcount:0
mapping:ffff0026dd81c880 index:0xffff0026b931fe00 compound_mapcount: 0
[   65.257923] flags: 0x6ffff00000010200(slab|head)
[   65.267649] raw: 6ffff00000010200 fffffe009b20b208 fffffe009ac07608
ffff0026dd81c880
[   65.283959] raw: ffff0026b931fe00 0000000000400002 00000001ffffffff
0000000000000000
[   65.300270] page dumped because: kasan: bad access detected

[   65.315139] Memory state around the buggy address:
[   65.325231]  ffff0026b931a980: fc fc fc fc fc fc fc fc fc fc fc fc fc
fc fc fc
[   65.340445]  ffff0026b931aa00: fc fc fc fc fc fc fc fc fc fc fc fc fc
fc fc fc
[   65.355660] >ffff0026b931aa80: fc fc fc fc fc fc fc fc fc fc fc fc fc
fc fc fc
[   65.370870]                                                        ^
[   65.384256]  ffff0026b931ab00: fc fc fc fc fc fc fc fc fc fc fc fc fc
fc fc fc
[   65.399467]  ffff0026b931ab80: fc fc fc fc fc fc fc fc fc fc fc fc fc
fc fc fc
[   65.414675]
==================================================================
[   65.429885] Disabling lock debugging due to kernel taint
[   65.441431] Unable to handle kernel paging request at virtual address
ffffa0001013c0b0
[   65.441695] ehci-pci 0000:7a:01.0: EHCI Host Controller
[   65.458088] Mem abort info:
[   65.469183] ehci-pci 0000:7a:01.0: new USB bus registered, assigned
bus number 1
[   65.474927]   ESR = 0x96000007
[   65.491201] ehci-pci 0000:7a:01.0: irq 65, io mem 0x20c101000
[   65.496913]   EC = 0x25: DABT (current EL), IL = 32 bits
[   65.496918]   SET = 0, FnV = 0
[   65.496922]   EA = 0, S1PTW = 0
[   65.522586] ehci-pci 0000:7a:01.0: USB 0.0 started, EHCI 1.00
[   65.526575] Data abort info:
[   65.526580]   ISV = 0, ISS = 0x00000007
[   65.535948] hub 1-0:1.0: USB hub found
[   65.545245]   CM = 0, WnR = 0
[   65.545251] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000052530000
[   65.545256] [ffffa0001013c0b0] pgd=00002027fffff003,
pud=00002027ffffe003, pmd=00000026dda5b003, pte=0000000000000000
[   65.551519] hub 1-0:1.0: 2 ports detected
[   65.559375] Internal error: Oops: 96000007 [#1] PREEMPT SMP
[   65.559379] Modules linked in:
[   65.569534] ehci-platform: EHCI generic platform driver
[   65.573475] CPU: 34 PID: 8 Comm: kworker/u256:0 Tainted: G    B
        5.6.0-rc3-00005-g17ceebe3a05c-dirty #1775
[   65.573477] Hardware name: Huawei TaiShan 200 (Model 2280)/BC82AMDD,
BIOS 2280-V2 CS V3.B160.01 02/24/2020
[   65.573487] Workqueue: poll_megasas0_status megasas_fault_detect_work
[   65.573492] pstate: 80c00009 (Nzcv daif +PAN +UAO)
[   65.588048] ehci-orion: EHCI orion driver
[   65.609756] pc : megasas_readl+0x60/0x80
[   65.609759] lr : megasas_readl+0x1c/0x80
[   65.609761] sp : ffff0026d97bfc00
[   65.609763] x29: ffff0026d97bfc00 x28: ffff0026d97a9890
[   65.609767] x27: ffff0026d97a0618 x26: ffff0026d97a9880
[   65.609771] x25: ffff0026d9758808 x24: ffff0026b931aa28
[   65.609775] x23: ffff0026b931aa98 x22: ffffa0002931e000
[   65.609779] x21: ffff0026dd898800 x20: ffff0026b931dcd8
[   65.618543] ehci-exynos: EHCI Exynos driver
[   65.629840] x19: ffffa0001013c0b0 x18: 0000000000000000
[   65.629843] x17: 0000000000001d50 x16: ffffffffffffe240
[   65.629847] x15: 00000000000013a8 x14: 0000000000000000
[   65.629850] x13: 00000000000013a0 x12: 1fffe004db2f7f7c
[   65.629854] x11: ffff8004db2f7f78 x10: dfffa00000000000
[   65.629857] x9 : ffffa00028f679e8 x8 : ffffa0002a483a48
[   65.629861] x7 : ffffa00026d5ed94 x6 : 0000000000000000
[   65.629864] x5 : ffffa0002a483a48 x4 : 0000000000000000
[   65.629868] x3 : ffffa000279df03c x2 : 0000000000000000
[   65.636662] ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
[   65.647207] x1 : ef244e124d671400 x0 : 0000000000000004
[   65.647210] Call trace:
[   65.647214]  megasas_readl+0x60/0x80
[   65.647218]  megasas_read_fw_status_reg_fusion+0x2c/0x38
[   65.647221]  megasas_fault_detect_work+0x44/0x520
[   65.647226]  process_one_work+0x488/0xc08
[   65.647228]  worker_thread+0x68/0x5d0
[   65.647233]  kthread+0x1c8/0x1d0
[   65.669535] ohci-pci: OHCI PCI platform driver
[   65.689683]  ret_from_fork+0x10/0x18
[   65.689689] Code: 54ffff09 a94153f3 a8c27bfd d65f03c0 (b9400260)
[   65.689695] ---[ end trace 3632c7efc4f2d69c ]---


That's on 5.6-rc3.

Please have a look,

John






