Seeing this on a RHEL kernel with upstream backports, wondering if this was ever fixed

Hello

https://www.spinics.net/lists/linux-rdma/msg51334.html

A RHEL 7.5 kernel with backports from upstream is hitting this.
Chuck reported it, and Sagi and Max responded, but it's not clear
whether we ever fixed this.

In this case we end up in a panic, not just messaging, although the
messages were logged over and over for a long time before we finally
panicked.

crash> log | grep "memreg failure: memor" | wc -l
2414

crash> log
[1635578.012721]  connection16:0: detected conn error (1011)
[1635587.050688] mlx5_0:dump_cqe:262:(pid 93128): dump error cqe
[1635587.089686] 00000000 00000000 00000000 00000000
[1635587.123989] 00000000 00000000 00000000 00000000
[1635587.157494] 00000000 00000000 00000000 00000000
[1635587.190968] 00000000 08007806 250002ad ba6115d3

[1635587.224331] iser: iser_err_comp: memreg failure: memory management
operation error (6) vend_err 78
[1635587.278876]  connection15:0: detected conn error (1011)
[1635590.986286] mlx5_1:dump_cqe:262:(pid 0): dump error cqe
[1635591.021891] 00000000 00000000 00000000 00000000
[1635591.053944] 00000000 00000000 00000000 00000000

[1657077.997960] BUG: unable to handle kernel NULL pointer dereference
at 0000000000000010
[1657077.997967] IP: [<ffffffffc08a541e>] iscsi_verify_itt+0x1e/0x110
[libiscsi]
[1657077.997970] PGD 80000098de387067 PUD b8d9ffa067 PMD 0 
[1657077.997971] Oops: 0000 [#1] SMP 
[1657077.998009] Modules linked in: oracleasm(O) nfsv3 rpcsec_gss_krb5
nfsv4 dns_resolver nfs fscache dm_round_robin bonding rpcrdma ib_isert
iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt
target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm
ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx5_ib ib_core vfat fat
xfs sb_edac edac_core intel_powerclamp coretemp intel_rapl iosf_mbi
kvm_intel kvm irqbypass iTCO_wdt crc32_pclmul ipmi_ssif
iTCO_vendor_support ghash_clmulni_intel aesni_intel lrw gf128mul
ipmi_si glue_helper ablk_helper cryptd sg hpwdt hpilo pcspkr
ipmi_devintf ioatdma dm_multipath i2c_i801 lpc_ich shpchp dca wmi
ipmi_msghandler pcc_cpufreq acpi_power_meter nfsd binfmt_misc
auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4 mbcache jbd2
sd_mod crc_t10dif crct10dif_generic
[1657077.998020]  i2c_algo_bit drm_kms_helper syscopyarea sysfillrect
sysimgblt fb_sys_fops ttm bnx2x mlx5_core crct10dif_pclmul mdio tg3(OE)
devlink libcrc32c crct10dif_common drm hpsa(OE) ptp i2c_core
crc32c_intel scsi_transport_sas pps_core dm_mirror dm_region_hash
dm_log dm_mod
[1657077.998023] CPU: 20 PID: 41538 Comm: sh Tainted: G           OE  -
-----------   3.10.0-693.34.1.el7_bz1582551.x86_64 #1
[1657077.998024] Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380
Gen9, BIOS P89 05/21/2018
[1657077.998025] task: ffff88587ce38fd0 ti: ffff884dd0af0000 task.ti:
ffff884dd0af0000
[1657077.998029] RIP: 0010:[<ffffffffc08a541e>]  [<ffffffffc08a541e>]
iscsi_verify_itt+0x1e/0x110 [libiscsi]
[1657077.998030] RSP: 0000:ffff88beff403d78  EFLAGS: 00010286
[1657077.998031] RAX: 000000000000004c RBX: 00000000b0000036 RCX:
0000000000000002
[1657077.998032] RDX: 00000000000000cc RSI: 00000000b0000036 RDI:
0000000000000000
[1657077.998033] RBP: ffff88beff403da0 R08: 0000000040032a20 R09:
ffff8896e4eaf91c
[1657077.998034] R10: 0000000000000000 R11: 00007ffff7763ca0 R12:
0000000000000000
[1657077.998035] R13: ffff8896e4eaf9e4 R14: ffff8896e4eaf900 R15:
0000000000000000
[1657077.998036] FS:  00007ffff7fe6740(0000) GS:ffff88beff400000(0000)
knlGS:0000000000000000
[1657077.998038] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1657077.998039] CR2: 0000000000000010 CR3: 000000ad92eba000 CR4:
00000000003607e0
[1657077.998040] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[1657077.998041] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[1657077.998042] Call Trace:
[1657077.998044]  <IRQ> 
[1657077.998046]  [<ffffffffc08a5527>] iscsi_itt_to_ctask+0x17/0x80
[libiscsi]
[1657077.998050]  [<ffffffffc05eefea>] iser_task_rsp+0xca/0x360
[ib_iser]
[1657077.998061]  [<ffffffffc0587fbb>] __ib_process_cq+0x6b/0xe0
[ib_core]
[1657077.998066]  [<ffffffffc0588122>] ib_poll_handler+0x22/0x80
[ib_core]
[1657077.998070]  [<ffffffff81358507>] irq_poll_softirq+0xc7/0x100
[1657077.998076]  [<ffffffff81095195>] __do_softirq+0xf5/0x280
[1657077.998081]  [<ffffffff816c4e8c>] call_softirq+0x1c/0x30
[1657077.998086]  [<ffffffff8102d435>] do_softirq+0x65/0xa0
[1657077.998088]  [<ffffffff81095515>] irq_exit+0x105/0x110
[1657077.998091]  [<ffffffff816c61d6>] do_IRQ+0x56/0xf0
[1657077.998098]  [<ffffffff816b837c>] common_interrupt+0x17c/0x17c
[1657077.998099]  <EOI> 
[1657077.998113] Code: ff ff ff eb a9 41 be 95 ff ff ff eb a1 0f 1f 44
00 00 55 48 89 e5 41 55 41 54 49 89 fc 53 89 f3 48 83 ec 10 c7 45 d8 00
00 00 00 <4c> 8b 6f 10 65 48 8b 04 25 28 00 00 00 48 89 45 e0 31 c0 83
fe 
[1657077.998116] RIP  [<ffffffffc08a541e>] iscsi_verify_itt+0x1e/0x110
[libiscsi]
[1657077.998116]  RSP <ffff88beff403d78>
[1657077.998117] CR2: 0000000000000010
crash> 

crash> bt
PID: 41538  TASK: ffff88587ce38fd0  CPU: 20  COMMAND: "sh"
 #0 [ffff88beff403a18] machine_kexec at ffffffff8105ddeb
 #1 [ffff88beff403a78] __crash_kexec at ffffffff81109902
 #2 [ffff88beff403b48] crash_kexec at ffffffff811099f0
 #3 [ffff88beff403b60] oops_end at ffffffff816b97a8
 #4 [ffff88beff403b88] no_context at ffffffff816a8c96
 #5 [ffff88beff403bd8] __bad_area_nosemaphore at ffffffff816a8d2c
 #6 [ffff88beff403c20] bad_area_nosemaphore at ffffffff816a8e96
 #7 [ffff88beff403c30] __do_page_fault at ffffffff816bc6be
 #8 [ffff88beff403c90] do_page_fault at ffffffff816bc865
 #9 [ffff88beff403cc0] page_fault at ffffffff816b8788
    [exception RIP: iscsi_verify_itt+30]
    RIP: ffffffffc08a541e  RSP: ffff88beff403d78  RFLAGS: 00010286
    RAX: 000000000000004c  RBX: 00000000b0000036  RCX: 0000000000000002
    RDX: 00000000000000cc  RSI: 00000000b0000036  RDI: 0000000000000000
    RBP: ffff88beff403da0   R8: 0000000040032a20   R9: ffff8896e4eaf91c
    R10: 0000000000000000  R11: 00007ffff7763ca0  R12: 0000000000000000
    R13: ffff8896e4eaf9e4  R14: ffff8896e4eaf900  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
#10 [ffff88beff403da8] iscsi_itt_to_ctask at ffffffffc08a5527
[libiscsi]
#11 [ffff88beff403dc8] iser_task_rsp at ffffffffc05eefea [ib_iser]
#12 [ffff88beff403e10] __ib_process_cq at ffffffffc0587fbb [ib_core]
#13 [ffff88beff403e50] ib_poll_handler at ffffffffc0588122 [ib_core]
#14 [ffff88beff403e80] irq_poll_softirq at ffffffff81358507
#15 [ffff88beff403eb8] __do_softirq at ffffffff81095195
#16 [ffff88beff403f28] call_softirq at ffffffff816c4e8c
#17 [ffff88beff403f40] do_softirq at ffffffff8102d435
#18 [ffff88beff403f60] irq_exit at ffffffff81095515
#19 [ffff88beff403f78] do_IRQ at ffffffff816c61d6
--- <IRQ stack> ---
#20 [ffff884dd0af3f58] ret_from_intr at ffffffff816b837c
    RIP: 000000000041b866  RSP: 00007fffffffea28  RFLAGS: 00000206
    RAX: 0000000000000000  RBX: 00007fffffffef53  RCX: 00000000006f1a70
    RDX: 00000000006f1a70  RSI: 00000000006f1a90  RDI: 0000000000000000
    RBP: 0000000000000002   R8: 0000000000000001   R9: 0000000000000020
    R10: 0000000000000003  R11: 00007ffff7763ca0  R12: ffff88beff4061e8
    R13: 00000000ffffffff  R14: 0000000000000000  R15: 0000000000000063
    ORIG_RAX: ffffffffffffffbb  CS: 0033  SS: 002b

crash> ps -p 41538
PID: 0      TASK: ffffffff81a0e480  CPU: 0   COMMAND: "swapper/0"
 PID: 1      TASK: ffff88012e4c8000  CPU: 7   COMMAND: "systemd"
  PID: 2345   TASK: ffff885ef5eb8fd0  CPU: 14  COMMAND: "zabbix_agentd"
   PID: 2349   TASK: ffff885efcbcaf70  CPU: 1   COMMAND:
"zabbix_agentd"
    PID: 41538  TASK: ffff88587ce38fd0  CPU: 20  COMMAND: "sh"
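
As a sanity check on the oops above, the faulting bytes marked in the Code:
line (<4c> 8b 6f 10) can be decoded by hand. The sketch below is only an
illustration: decode_mov_load is a hypothetical helper that handles just this
one instruction form, and the register value is copied from the dump. It
suggests the fault is a load from offset 0x10 off RDI, which is 0 here and
matches CR2. Since RDI carries the first argument in the x86-64 calling
convention, the first argument to iscsi_verify_itt (presumably the conn
pointer) appears to have been NULL. Which struct field sits at offset 0x10 is
not checked against this kernel build.

```python
# Register names indexed by their 4-bit x86-64 encoding.
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
         "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def decode_mov_load(code: bytes):
    """Decode `REX.W 8B /r disp8`, i.e. mov r64, [base + disp8].

    Hypothetical helper for this one encoding only; SIB forms are rejected.
    """
    rex, opcode, modrm, disp8 = code[0], code[1], code[2], code[3]
    if rex & 0xF8 != 0x48 or opcode != 0x8B:
        raise ValueError("not a REX.W mov r64, r/m64")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b01 or rm == 0b100:
        raise ValueError("only the disp8, non-SIB form is handled")
    dest = REG64[reg | ((rex & 0x4) << 1)]  # REX.R extends the reg field
    base = REG64[rm | ((rex & 0x1) << 3)]   # REX.B extends the rm field
    disp = disp8 - 256 if disp8 >= 128 else disp8  # sign-extend disp8
    return dest, base, disp


# Bytes at the faulting RIP and the RDI value, both copied from the oops.
dest, base, disp = decode_mov_load(bytes.fromhex("4c8b6f10"))
regs = {"rdi": 0x0}
fault = regs[base] + disp
print(f"mov {disp:#x}(%{base}),%{dest} -> fault address {fault:#x}")
```

The computed address can then be compared with the CR2 value in the oops.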