Re: Seeing this on a RHEL kernel with upstream backports wondering if this was ever fixed

On Tue, 2018-08-07 at 14:26 -0400, Laurence Oberman wrote:
> On Fri, 2018-07-27 at 09:21 -0400, Laurence Oberman wrote:
> > On Fri, 2018-07-27 at 08:05 -0400, Laurence Oberman wrote:
> > > On Thu, 2018-07-26 at 16:02 -0400, Laurence Oberman wrote:
> > > > On Thu, 2018-07-26 at 10:28 -0400, Don Dutile wrote:
> > > > > On 07/26/2018 08:48 AM, Laurence Oberman wrote:
> > > > > > Hello
> > > > > > 
> > > > > > https://www.spinics.net/lists/linux-rdma/msg51334.html
> > > > > > 
> > > > > > A RHEL 7.5 kernel with backports from upstream is hitting this.
> > > > > > Chuck reported it, and Sagi and Max responded, but it's not
> > > > > > clear if we ever fixed this.
> > > > > > 
> > > > > 
> > > > > RHEL-7.5 data point:
> > > > > -- drivers/infiniband/* -r is backported to v4.14.
> > > > >     i.e., includes the patch(es) mentioned in the above thread.
> > > > > 
> > > > > Laurence:
> > > > > Please test with the 7.6 kernel & report back.
> > > > > If that passes, RH can bisect the bug fix between v4.14 & v4.16
> > > > > (the 7.6 update point for its rdma kernel core),
> > > > > and backport to 7.5-zstream.  Note: you'll have to update the
> > > > > rdma-core pkg to the 7.6 version as well.
> > > > > All functional & bug fix patches to mlx* (ib & enet) are in as
> > > > > well (same kernel references).
> > > > > 
> > > > > -dd
> > > > > 
> > > > > > In this case we end up in a panic, not just messaging, although
> > > > > > the messages were logged over and over for a long time until we
> > > > > > finally panicked.
> > > > > > 
> > > > > > crash> log | grep "memreg failure: memor" | wc -l
> > > > > > 2414
> > > > > > 
> > > > > > crash> log
> > > > > > [1635578.012721]  connection16:0: detected conn error
> > > > > > (1011)
> > > > > > [1635587.050688] mlx5_0:dump_cqe:262:(pid 93128): dump
> > > > > > error
> > > > > > cqe
> > > > > > [1635587.089686] 00000000 00000000 00000000 00000000
> > > > > > [1635587.123989] 00000000 00000000 00000000 00000000
> > > > > > [1635587.157494] 00000000 00000000 00000000 00000000
> > > > > > [1635587.190968] 00000000 08007806 250002ad ba6115d3
> > > > > > 
> > > > > > [1635587.224331] iser: iser_err_comp: memreg failure:
> > > > > > memory
> > > > > > management
> > > > > > operation error (6) vend_err 78
> > > > > > [1635587.278876]  connection15:0: detected conn error
> > > > > > (1011)
> > > > > > [1635590.986286] mlx5_1:dump_cqe:262:(pid 0): dump error
> > > > > > cqe
> > > > > > [1635591.021891] 00000000 00000000 00000000 00000000
> > > > > > [1635591.053944] 00000000 00000000 00000000 00000000
> > > > > > 
> > > > > > [1657077.997960] BUG: unable to handle kernel NULL pointer
> > > > > > dereference
> > > > > > at 0000000000000010
> > > > > > [1657077.997967] IP: [<ffffffffc08a541e>]
> > > > > > iscsi_verify_itt+0x1e/0x110
> > > > > > [libiscsi]
> > > > > > [1657077.997970] PGD 80000098de387067 PUD b8d9ffa067 PMD 0
> > > > > > [1657077.997971] Oops: 0000 [#1] SMP
> > > > > > [1657077.998009] Modules linked in: oracleasm(O) nfsv3
> > > > > > rpcsec_gss_krb5
> > > > > > nfsv4 dns_resolver nfs fscache dm_round_robin bonding
> > > > > > rpcrdma
> > > > > > ib_isert
> > > > > > iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi
> > > > > > ib_srpt
> > > > > > target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib
> > > > > > rdma_ucm
> > > > > > ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx5_ib
> > > > > > ib_core
> > > > > > vfat
> > > > > > fat
> > > > > > xfs sb_edac edac_core intel_powerclamp coretemp intel_rapl
> > > > > > iosf_mbi
> > > > > > kvm_intel kvm irqbypass iTCO_wdt crc32_pclmul ipmi_ssif
> > > > > > iTCO_vendor_support ghash_clmulni_intel aesni_intel lrw
> > > > > > gf128mul
> > > > > > ipmi_si glue_helper ablk_helper cryptd sg hpwdt hpilo
> > > > > > pcspkr
> > > > > > ipmi_devintf ioatdma dm_multipath i2c_i801 lpc_ich shpchp
> > > > > > dca
> > > > > > wmi
> > > > > > ipmi_msghandler pcc_cpufreq acpi_power_meter nfsd
> > > > > > binfmt_misc
> > > > > > auth_rpcgss nfs_acl lockd grace sunrpc ip_tables ext4
> > > > > > mbcache
> > > > > > jbd2
> > > > > > sd_mod crc_t10dif crct10dif_generic
> > > > > > [1657077.998020]  i2c_algo_bit drm_kms_helper syscopyarea
> > > > > > sysfillrect
> > > > > > sysimgblt fb_sys_fops ttm bnx2x mlx5_core crct10dif_pclmul
> > > > > > mdio
> > > > > > tg3(OE)
> > > > > > devlink libcrc32c crct10dif_common drm hpsa(OE) ptp
> > > > > > i2c_core
> > > > > > crc32c_intel scsi_transport_sas pps_core dm_mirror
> > > > > > dm_region_hash
> > > > > > dm_log dm_mod
> > > > > > [1657077.998023] CPU: 20 PID: 41538 Comm: sh Tainted:
> > > > > > G           OE  -
> > > > > > -----------   3.10.0-693.34.1.el7_bz1582551.x86_64 #1
> > > > > > [1657077.998024] Hardware name: HP ProLiant DL380
> > > > > > Gen9/ProLiant
> > > > > > DL380
> > > > > > Gen9, BIOS P89 05/21/2018
> > > > > > [1657077.998025] task: ffff88587ce38fd0 ti:
> > > > > > ffff884dd0af0000
> > > > > > task.ti:
> > > > > > ffff884dd0af0000
> > > > > > [1657077.998029] RIP:
> > > > > > 0010:[<ffffffffc08a541e>]  [<ffffffffc08a541e>]
> > > > > > iscsi_verify_itt+0x1e/0x110 [libiscsi]
> > > > > > [1657077.998030] RSP: 0000:ffff88beff403d78  EFLAGS:
> > > > > > 00010286
> > > > > > [1657077.998031] RAX: 000000000000004c RBX:
> > > > > > 00000000b0000036
> > > > > > RCX:
> > > > > > 0000000000000002
> > > > > > [1657077.998032] RDX: 00000000000000cc RSI:
> > > > > > 00000000b0000036
> > > > > > RDI:
> > > > > > 0000000000000000
> > > > > > [1657077.998033] RBP: ffff88beff403da0 R08:
> > > > > > 0000000040032a20
> > > > > > R09:
> > > > > > ffff8896e4eaf91c
> > > > > > [1657077.998034] R10: 0000000000000000 R11:
> > > > > > 00007ffff7763ca0
> > > > > > R12:
> > > > > > 0000000000000000
> > > > > > [1657077.998035] R13: ffff8896e4eaf9e4 R14:
> > > > > > ffff8896e4eaf900
> > > > > > R15:
> > > > > > 0000000000000000
> > > > > > [1657077.998036] FS:  00007ffff7fe6740(0000)
> > > > > > GS:ffff88beff400000(0000)
> > > > > > knlGS:0000000000000000
> > > > > > [1657077.998038] CS:  0010 DS: 0000 ES: 0000 CR0:
> > > > > > 0000000080050033
> > > > > > [1657077.998039] CR2: 0000000000000010 CR3:
> > > > > > 000000ad92eba000
> > > > > > CR4:
> > > > > > 00000000003607e0
> > > > > > [1657077.998040] DR0: 0000000000000000 DR1:
> > > > > > 0000000000000000
> > > > > > DR2:
> > > > > > 0000000000000000
> > > > > > [1657077.998041] DR3: 0000000000000000 DR6:
> > > > > > 00000000fffe0ff0
> > > > > > DR7:
> > > > > > 0000000000000400
> > > > > > [1657077.998042] Call Trace:
> > > > > > [1657077.998044]  <IRQ>
> > > > > > [1657077.998046]  [<ffffffffc08a5527>]
> > > > > > iscsi_itt_to_ctask+0x17/0x80
> > > > > > [libiscsi]
> > > > > > [1657077.998050]  [<ffffffffc05eefea>]
> > > > > > iser_task_rsp+0xca/0x360
> > > > > > [ib_iser]
> > > > > > [1657077.998061]  [<ffffffffc0587fbb>]
> > > > > > __ib_process_cq+0x6b/0xe0
> > > > > > [ib_core]
> > > > > > [1657077.998066]  [<ffffffffc0588122>]
> > > > > > ib_poll_handler+0x22/0x80
> > > > > > [ib_core]
> > > > > > [1657077.998070]  [<ffffffff81358507>]
> > > > > > irq_poll_softirq+0xc7/0x100
> > > > > > [1657077.998076]  [<ffffffff81095195>]
> > > > > > __do_softirq+0xf5/0x280
> > > > > > [1657077.998081]  [<ffffffff816c4e8c>]
> > > > > > call_softirq+0x1c/0x30
> > > > > > [1657077.998086]  [<ffffffff8102d435>] do_softirq+0x65/0xa0
> > > > > > [1657077.998088]  [<ffffffff81095515>] irq_exit+0x105/0x110
> > > > > > [1657077.998091]  [<ffffffff816c61d6>] do_IRQ+0x56/0xf0
> > > > > > [1657077.998098]  [<ffffffff816b837c>]
> > > > > > common_interrupt+0x17c/0x17c
> > > > > > [1657077.998099]  <EOI>
> > > > > > [1657077.998113] Code: ff ff ff eb a9 41 be 95 ff ff ff eb
> > > > > > a1
> > > > > > 0f
> > > > > > 1f
> > > > > > 44
> > > > > > 00 00 55 48 89 e5 41 55 41 54 49 89 fc 53 89 f3 48 83 ec 10
> > > > > > c7
> > > > > > 45
> > > > > > d8 00
> > > > > > 00 00 00 <4c> 8b 6f 10 65 48 8b 04 25 28 00 00 00 48 89 45
> > > > > > e0
> > > > > > 31
> > > > > > c0
> > > > > > 83
> > > > > > fe
> > > > > > [1657077.998116] RIP  [<ffffffffc08a541e>]
> > > > > > iscsi_verify_itt+0x1e/0x110
> > > > > > [libiscsi]
> > > > > > [1657077.998116]  RSP <ffff88beff403d78>
> > > > > > [1657077.998117] CR2: 0000000000000010
> > > > > > crash>
> > > > > > 
> > > > > > crash> bt
> > > > > > PID: 41538  TASK: ffff88587ce38fd0  CPU: 20  COMMAND: "sh"
> > > > > >   #0 [ffff88beff403a18] machine_kexec at ffffffff8105ddeb
> > > > > >   #1 [ffff88beff403a78] __crash_kexec at ffffffff81109902
> > > > > >   #2 [ffff88beff403b48] crash_kexec at ffffffff811099f0
> > > > > >   #3 [ffff88beff403b60] oops_end at ffffffff816b97a8
> > > > > >   #4 [ffff88beff403b88] no_context at ffffffff816a8c96
> > > > > >   #5 [ffff88beff403bd8] __bad_area_nosemaphore at
> > > > > > ffffffff816a8d2c
> > > > > >   #6 [ffff88beff403c20] bad_area_nosemaphore at
> > > > > > ffffffff816a8e96
> > > > > >   #7 [ffff88beff403c30] __do_page_fault at ffffffff816bc6be
> > > > > >   #8 [ffff88beff403c90] do_page_fault at ffffffff816bc865
> > > > > >   #9 [ffff88beff403cc0] page_fault at ffffffff816b8788
> > > > > >      [exception RIP: iscsi_verify_itt+30]
> > > > > >      RIP: ffffffffc08a541e  RSP: ffff88beff403d78  RFLAGS:
> > > > > > 00010286
> > > > > >      RAX: 000000000000004c  RBX: 00000000b0000036  RCX:
> > > > > > 0000000000000002
> > > > > >      RDX: 00000000000000cc  RSI: 00000000b0000036  RDI:
> > > > > > 0000000000000000
> > > > > >      RBP: ffff88beff403da0   R8: 0000000040032a20   R9:
> > > > > > ffff8896e4eaf91c
> > > > > >      R10: 0000000000000000  R11: 00007ffff7763ca0  R12:
> > > > > > 0000000000000000
> > > > > >      R13: ffff8896e4eaf9e4  R14: ffff8896e4eaf900  R15:
> > > > > > 0000000000000000
> > > > > >      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
> > > > > > #10 [ffff88beff403da8] iscsi_itt_to_ctask at
> > > > > > ffffffffc08a5527
> > > > > > [libiscsi]
> > > > > > #11 [ffff88beff403dc8] iser_task_rsp at ffffffffc05eefea
> > > > > > [ib_iser]
> > > > > > #12 [ffff88beff403e10] __ib_process_cq at ffffffffc0587fbb
> > > > > > [ib_core]
> > > > > > #13 [ffff88beff403e50] ib_poll_handler at ffffffffc0588122
> > > > > > [ib_core]
> > > > > > #14 [ffff88beff403e80] irq_poll_softirq at ffffffff81358507
> > > > > > #15 [ffff88beff403eb8] __do_softirq at ffffffff81095195
> > > > > > #16 [ffff88beff403f28] call_softirq at ffffffff816c4e8c
> > > > > > #17 [ffff88beff403f40] do_softirq at ffffffff8102d435
> > > > > > #18 [ffff88beff403f60] irq_exit at ffffffff81095515
> > > > > > #19 [ffff88beff403f78] do_IRQ at ffffffff816c61d6
> > > > > > --- <IRQ stack> ---
> > > > > > #20 [ffff884dd0af3f58] ret_from_intr at ffffffff816b837c
> > > > > >      RIP: 000000000041b866  RSP: 00007fffffffea28  RFLAGS:
> > > > > > 00000206
> > > > > >      RAX: 0000000000000000  RBX: 00007fffffffef53  RCX:
> > > > > > 00000000006f1a70
> > > > > >      RDX: 00000000006f1a70  RSI: 00000000006f1a90  RDI:
> > > > > > 0000000000000000
> > > > > >      RBP: 0000000000000002   R8: 0000000000000001   R9:
> > > > > > 0000000000000020
> > > > > >      R10: 0000000000000003  R11: 00007ffff7763ca0  R12:
> > > > > > ffff88beff4061e8
> > > > > >      R13: 00000000ffffffff  R14: 0000000000000000  R15:
> > > > > > 0000000000000063
> > > > > >      ORIG_RAX: ffffffffffffffbb  CS: 0033  SS: 002b
> > > > > > 
> > > > > > crash> ps -p 41538
> > > > > > PID: 0      TASK: ffffffff81a0e480  CPU: 0   COMMAND:
> > > > > > "swapper/0"
> > > > > >   PID: 1      TASK: ffff88012e4c8000  CPU: 7   COMMAND:
> > > > > > "systemd"
> > > > > >    PID: 2345   TASK: ffff885ef5eb8fd0  CPU: 14  COMMAND:
> > > > > > "zabbix_agentd"
> > > > > >     PID: 2349   TASK: ffff885efcbcaf70  CPU: 1   COMMAND:
> > > > > > "zabbix_agentd"
> > > > > >      PID: 41538  TASK: ffff88587ce38fd0  CPU: 20  COMMAND:
> > > > > > "sh"
> > > > > > 
> > > > > 
> > > > > 
> > > > 
> > > > Don,
> > > > I misspoke about the kernel version; it's 7.4:
> > > > 3.10.0-693.34.1.el7_bz1582551.x86_64
> > > > It's the one we added the missing iscsi patches to, but the base
> > > > is 7.4.
> > > > So I will test with 7.5.
> > > > 
> > > 
> > > Don, I had another look at this.
> > > 
> > > It's not the SG_GAPS issue causing a memory registration error that
> > > I reported and that we fixed in 7.5 from upstream.
> > > 
> > > Which commit did we pull into 7.5 from upstream to fix that?
> > > 
> > > I think this is different and not yet fixed.
> > > 
> > > [14556.614551] iser: iser_err_comp: memreg failure: memory
> > > management
> > > operation error (6) vend_err 78
> > > [14556.666134]  connection1:0: detected conn error (1011)
> > > [14562.678414] mlx5_1:dump_cqe:262:(pid 0): dump error cqe
> > > [14562.678529] mlx5_0:dump_cqe:262:(pid 0): dump error cqe
> > > [14562.678530] 00000000 00000000 00000000 00000000
> > > [14562.678531] 00000000 00000000 00000000 00000000
> > > [14562.678531] 00000000 00000000 00000000 00000000
> > > [14562.678532] 00000000 08007806 25000344 34681cd2
> > > [14562.678535] iser: iser_err_comp: memreg failure: memory
> > > management
> > > operation error (6) vend_err 78
> > > [14562.678544]  connection1:0: detected conn error (1011)
> > > [14562.679098] BUG: unable to handle kernel NULL pointer
> > > dereference
> > > at
> > > 0000000000000010
> > > [14562.679105] IP: [<ffffffffc088141e>]
> > > iscsi_verify_itt+0x1e/0x110
> > > [libiscsi]
> > > [14562.679106] PGD 0
> > > [14562.679107] Oops: 0000 [#1] SMP
> > > [14562.679134] Modules linked in: ip6table_filter ip6_tables
> > > iptable_filter sctp_diag sctp tcp_diag udp_diag inet_diag
> > > unix_diag
> > > af_packet_diag netlink_diag bnx2i cnic uio ip_vs nf_conntrack
> > > oracleadvm(POE) oracleoks(POE) oracleasm(O) nfsv3 rpcsec_gss_krb5
> > > nfsv4
> > > dns_resolver nfs fscache dm_round_robin bonding rpcrdma ib_isert
> > > iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt
> > > target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib
> > > rdma_ucm
> > > ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx5_ib ib_core xfs
> > > vfat
> > > fat sb_edac edac_core intel_powerclamp coretemp intel_rapl
> > > iosf_mbi
> > > kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel
> > > aesni_intel
> > > lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt
> > > iTCO_vendor_support ipmi_ssif pcspkr ipmi_si dm_multipath ioatdma
> > > lpc_ich i2c_i801 sg hpilo
> > > [14562.679152]  hpwdt dca ipmi_devintf ipmi_msghandler
> > > pcc_cpufreq
> > > shpchp wmi acpi_power_meter binfmt_misc nfsd auth_rpcgss nfs_acl
> > > lockd
> > > grace sunrpc ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif
> > > crct10dif_generic i2c_algo_bit drm_kms_helper syscopyarea
> > > sysfillrect
> > > sysimgblt fb_sys_fops ttm bnx2x mlx5_core devlink mdio tg3(OE)
> > > libcrc32c drm crct10dif_pclmul hpsa(OE) crct10dif_common ptp
> > > i2c_core
> > > crc32c_intel scsi_transport_sas pps_core dm_mirror dm_region_hash
> > > dm_log dm_mod
> > > [14562.679154] CPU: 9 PID: 0 Comm: swapper/9 Tainted:
> > > P           OE  -
> > > -----------   3.10.0-693.22.1.el7.x86_64 #1
> > > [14562.679155] Hardware name: HP ProLiant DL380 Gen9/ProLiant
> > > DL380
> > > Gen9, BIOS P89 05/21/2018
> > > [14562.679156] task: ffff8860aefaaf70 ti: ffff8860ae440000
> > > task.ti:
> > > ffff8860ae440000
> > > [14562.679158] RIP:
> > > 0010:[<ffffffffc088141e>]  [<ffffffffc088141e>]
> > > iscsi_verify_itt+0x1e/0x110 [libiscsi]
> > > [14562.679159] RSP: 0018:ffff88beff2c3d78  EFLAGS: 00010286
> > > [14562.679160] RAX: 000000000000004c RBX: 00000000d0000041 RCX:
> > > 0000000000000002
> > > [14562.679161] RDX: 00000000000000cc RSI: 00000000d0000041 RDI:
> > > 0000000000000000
> > > [14562.679161] RBP: ffff88beff2c3da0 R08: 0000000040001038 R09:
> > > ffff88ae496fe01c
> > > [14562.679162] R10: 0000000000000000 R11: 7fffffffffffffff R12:
> > > 0000000000000000
> > > [14562.679162] R13: ffff88ae496fe0e4 R14: ffff88ae496fe000 R15:
> > > 0000000000000000
> > > [14562.679163] FS:  0000000000000000(0000)
> > > GS:ffff88beff2c0000(0000)
> > > knlGS:0000000000000000
> > > [14562.679164] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [14562.679164] CR2: 0000000000000010 CR3: 000000beede48000 CR4:
> > > 00000000003607e0
> > > [14562.679165] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> > > 0000000000000000
> > > [14562.679166] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
> > > 0000000000000400
> > > [14562.679166] Call Trace:
> > > [14562.679168]  <IRQ>
> > > [14562.679170]  [<ffffffffc0881527>] iscsi_itt_to_ctask+0x17/0x80
> > > [libiscsi]
> > > [14562.679173]  [<ffffffffc069ffea>] iser_task_rsp+0xca/0x360
> > > [ib_iser]
> > > [14562.679181]  [<ffffffffc0924fbb>] __ib_process_cq+0x6b/0xe0
> > > [ib_core]
> > 
> > Starts with the memreg failures
> > crash> log | grep "iser: iser_err_comp: memreg failure" | wc -l
> > 1237
> > 
> > Then the panic
> > 
> > [14556.614551] iser: iser_err_comp: memreg failure: memory
> > management
> > operation error (6) vend_err 78
> > [14556.666134]  connection1:0: detected conn error (1011)
> > [14562.678414] mlx5_1:dump_cqe:262:(pid 0): dump error cqe
> > [14562.678529] mlx5_0:dump_cqe:262:(pid 0): dump error cqe
> > [14562.678530] 00000000 00000000 00000000 00000000
> > [14562.678531] 00000000 00000000 00000000 00000000
> > [14562.678531] 00000000 00000000 00000000 00000000
> > [14562.678532] 00000000 08007806 25000344 34681cd2
> > [14562.678535] iser: iser_err_comp: memreg failure: memory
> > management
> > operation error (6) vend_err 78
> > [14562.678544]  connection1:0: detected conn error (1011)
> > 
> > [14562.679098] BUG: unable to handle kernel NULL pointer
> > dereference
> > at
> > 0000000000000010
> > [14562.679105] IP: [<ffffffffc088141e>] iscsi_verify_itt+0x1e/0x110
> > [libiscsi]
> > 
> > crash> bt
> > PID: 0      TASK: ffff8860aefaaf70  CPU: 9   COMMAND: "swapper/9"
> >  #0 [ffff88beff2c3a18] machine_kexec at ffffffff8105d77b
> >  #1 [ffff88beff2c3a78] __crash_kexec at ffffffff81108732
> >  #2 [ffff88beff2c3b48] crash_kexec at ffffffff81108820
> >  #3 [ffff88beff2c3b60] oops_end at ffffffff816b8778
> >  #4 [ffff88beff2c3b88] no_context at ffffffff816a7c7a
> >  #5 [ffff88beff2c3bd8] __bad_area_nosemaphore at ffffffff816a7d10
> >  #6 [ffff88beff2c3c20] bad_area_nosemaphore at ffffffff816a7e7a
> >  #7 [ffff88beff2c3c30] __do_page_fault at ffffffff816bb68e
> >  #8 [ffff88beff2c3c90] do_page_fault at ffffffff816bb835
> >  #9 [ffff88beff2c3cc0] page_fault at ffffffff816b7768
> >     [exception RIP: iscsi_verify_itt+30]
> >     RIP: ffffffffc088141e  RSP: ffff88beff2c3d78  RFLAGS: 00010286
> >     RAX: 000000000000004c  RBX: 00000000d0000041  RCX:
> > 0000000000000002
> >     RDX: 00000000000000cc  RSI: 00000000d0000041  RDI:
> > 0000000000000000
> >     RBP: ffff88beff2c3da0   R8: 0000000040001038   R9:
> > ffff88ae496fe01c
> >     R10: 0000000000000000  R11: 7fffffffffffffff  R12:
> > 0000000000000000
> >     R13: ffff88ae496fe0e4  R14: ffff88ae496fe000  R15:
> > 0000000000000000
> >     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
> > #10 [ffff88beff2c3da8] iscsi_itt_to_ctask at ffffffffc0881527
> > [libiscsi]
> > #11 [ffff88beff2c3dc8] iser_task_rsp at ffffffffc069ffea [ib_iser]
> > #12 [ffff88beff2c3e10] __ib_process_cq at ffffffffc0924fbb
> > [ib_core]
> > #13 [ffff88beff2c3e50] ib_poll_handler at ffffffffc0925122
> > [ib_core]
> > #14 [ffff88beff2c3e80] irq_poll_softirq at ffffffff813572b7
> > #15 [ffff88beff2c3eb8] __do_softirq at ffffffff81094035
> > #16 [ffff88beff2c3f28] call_softirq at ffffffff816c3afc
> > #17 [ffff88beff2c3f40] do_softirq at ffffffff8102d435
> > #18 [ffff88beff2c3f60] irq_exit at ffffffff810943b5
> > #19 [ffff88beff2c3f78] do_IRQ at ffffffff816c4d96
> > --- <IRQ stack> ---
> > #20 [ffff8860ae443db8] ret_from_intr at ffffffff816b7362
> >     [exception RIP: cpuidle_enter_state+87]
> >     RIP: ffffffff81530b07  RSP: ffff8860ae443e60  RFLAGS: 00000202
> >     RAX: 00000d3e7d729de6  RBX: ffff8860ae443e40  RCX:
> > 0000000000000018
> >     RDX: 0000000225c17d03  RSI: ffff8860ae443fd8  RDI:
> > 00000d3e7d729de6
> >     RBP: ffff8860ae443e88   R8: 000000000000016c   R9:
> > 000000000000001c
> >     R10: 0000000000000043  R11: 7fffffffffffffff  R12:
> > 0000000000000009
> >     R13: ffff88beff2d39a0  R14: ffffffff810b77e5  R15:
> > ffff8860ae443de0
> >     ORIG_RAX: ffffffffffffff5d  CS: 0010  SS: 0018
> > #21 [ffff8860ae443e90] cpuidle_idle_call at ffffffff81530c5e
> > #22 [ffff8860ae443ed0] arch_cpu_idle at ffffffff81034f8e
> > #23 [ffff8860ae443ee0] cpu_startup_entry at ffffffff810eb6da
> > #24 [ffff8860ae443f28] start_secondary at ffffffff81052222
> > 
> > crash> dis -l iscsi_verify_itt+30
> > /usr/src/debug/kernel-3.10.0-693.22.1.el7/linux-3.10.0-
> > 693.22.1.el7.x86_64/drivers/scsi/libiscsi.c: 1292
> > 0xffffffffc088141e
> > <iscsi_verify_itt+30>:       mov    0x10(%rdi),%r13
> > crash> 
> > 
> > 
> > So it fails here:
> > 
> > int iscsi_verify_itt(struct iscsi_conn *conn, itt_t itt)
> > {
> >         struct iscsi_session *session = conn->session;  **** conn->session is invalid
> > 
> > RDI held the struct iscsi_conn pointer.
> > 
> > 0xffffffffc0881400 <iscsi_verify_itt>:  nopl   0x0(%rax,%rax,1)
> > [FTRACE
> > NOP]
> > 0xffffffffc0881405 <iscsi_verify_itt+5>:        push   %rbp
> > 0xffffffffc0881406 <iscsi_verify_itt+6>:        mov    %rsp,%rbp
> > 0xffffffffc0881409 <iscsi_verify_itt+9>:        push   %r13
> > 0xffffffffc088140b <iscsi_verify_itt+11>:       push   %r12
> > 0xffffffffc088140d <iscsi_verify_itt+13>:       mov    %rdi,%r12
> > 0xffffffffc0881410 <iscsi_verify_itt+16>:       push   %rbx
> > 0xffffffffc0881411 <iscsi_verify_itt+17>:       mov    %esi,%ebx
> > 0xffffffffc0881413 <iscsi_verify_itt+19>:       sub    $0x10,%rsp
> > 0xffffffffc0881417 <iscsi_verify_itt+23>:       movl   $0x0,-
> > 0x28(%rbp)
> > 0xffffffffc088141e
> > <iscsi_verify_itt+30>:       mov    0x10(%rdi),%r13
> > 
> >    RIP: ffffffffc088141e  RSP: ffff88beff2c3d78  RFLAGS: 00010286
> >     RAX: 000000000000004c  RBX: 00000000d0000041  RCX:
> > 0000000000000002
> >     RDX: 00000000000000cc  RSI: 00000000d0000041  RDI:
> > 0000000000000000
> >     RBP: ffff88beff2c3da0   R8: 0000000040001038   R9:
> > ffff88ae496fe01c
> >     R10: 0000000000000000  R11: 7fffffffffffffff  R12:
> > 0000000000000000
> >     R13: ffff88ae496fe0e4  R14: ffff88ae496fe000  R15:
> > 0000000000000000
> > 
> > Both RDI and R12 are NULL; adding the 0x10 offset gives the bad
> > address 0x10 (the CR2 value).
> > 
> > So we have a race somehow that trashes the conn pointer under load.
> > 
> > The load clearly is seeing resource issues and repeatedly failing
> > the memory registration.
> 
> So, as I expected, the memreg issues are gone on 7.5, which was
> rebased against upstream.
> 
> We are now hitting the following, and I have been unable to reproduce
> it in-house after multiple attempts.
> 
> Aug  7 06:47:30 xxxxxxx kernel: WARNING: CPU: 20 PID: 36881 at
> lib/list_debug.c:36 __list_add+0x8a/0xc0
> Aug  7 06:47:30 xxxxxxx kernel: list_add double add:
> new=ffff9f01523b92c8, prev=ffff9f01523b92c8, next=ffff9f69e4216d88.
> Aug  7 06:47:30 xxxxxxx kernel: Modules linked in: bnx2i cnic uio ip_vs
> nf_conntrack ip6table_filter ip6_tables iptable_filter tcp_diag
> udp_diag inet_diag unix_diag af_packet_diag netlink_diag
> oracleadvm(POE) oracleoks(POE) oracleasm(O) nfsv3 rpcsec_gss_krb5
> nfsv4 dns_resolver nfs fscache dm_round_robin bonding rpcrdma
> ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi
> ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib
> rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx5_ib ib_core
> vfat fat xfs sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi
> kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel iTCO_wdt
> aesni_intel ipmi_ssif lrw iTCO_vendor_support gf128mul glue_helper
> ablk_helper ioatdma cryptd ipmi_si pcspkr joydev ipmi_devintf hpwdt
> i2c_i801 hpilo sg lpc_ich wmi dca ipmi_msghandler
> Aug  7 06:47:30 xxxxxxx kernel: acpi_power_meter pcc_cpufreq shpchp
> dm_multipath binfmt_misc nfsd auth_rpcgss nfs_acl lockd grace sunrpc
> ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic
> i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
> fb_sys_fops mlx5_core ttm mlxfw drm bnx2x tg3 devlink mdio
> crct10dif_pclmul libcrc32c crct10dif_common hpsa ptp i2c_core
> crc32c_intel scsi_transport_sas pps_core dm_mirror dm_region_hash
> dm_log dm_mod
> Aug  7 06:47:30 xxxxxxx kernel: CPU: 20 PID: 36881 Comm: sh Tainted:
> P        W  OE  ------------   3.10.0-862.9.1.el7.x86_64 #1
> Aug  7 06:47:30 xxxxxxx kernel: Hardware name: HP ProLiant DL380
> Gen9/ProLiant DL380 Gen9, BIOS P89 05/21/2018
> Aug  7 06:47:30 xxxxxxx kernel: Call Trace:
> Aug  7 06:47:30 xxxxxxx kernel: <IRQ>  [<ffffffffa650e84e>]
> dump_stack+0x19/0x1b
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa5e91e18>]
> __warn+0xd8/0x100
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa5e91e9f>]
> warn_slowpath_fmt+0x5f/0x80
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa6168d8a>]
> __list_add+0x8a/0xc0
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffc0ac3c75>]
> ipoib_start_xmit+0x485/0x6d0 [ib_ipoib]
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa63ec226>]
> dev_hard_start_xmit+0x246/0x3b0
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa6417aba>]
> sch_direct_xmit+0x11a/0x250
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa63ef111>]
> __dev_queue_xmit+0x4a1/0x660
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa63ef2e0>]
> dev_queue_xmit+0x10/0x20
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa63fad1d>]
> neigh_resolve_output+0x11d/0x220
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa60db10a>] ?
> selinux_ipv4_postroute+0x1a/0x20
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa643820c>]
> ip_finish_output+0x2ac/0x7a0
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa6438a03>]
> ip_output+0x73/0xe0
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa6437f60>] ?
> __ip_append_data.isra.50+0xa50/0xa50
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa64365f7>]
> ip_local_out_sk+0x37/0x40
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa6436963>]
> ip_queue_xmit+0x143/0x3a0
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa6450844>]
> tcp_transmit_skb+0x4e4/0x9e0
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa64528bf>]
> tcp_send_ack+0x11f/0x170
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa6445735>]
> tcp_send_dupack+0x25/0xd0
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa644ce86>]
> tcp_validate_incoming+0x186/0x2d0
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa644d18d>]
> tcp_rcv_established+0x1bd/0x770
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa6457e6a>]
> tcp_v4_do_rcv+0x10a/0x350
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa64595fc>]
> tcp_v4_rcv+0x78c/0x990
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffc0feafc6>] ?
> ip_vs_remote_request4+0x16/0x20 [ip_vs]
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa643272d>]
> ip_local_deliver_finish+0xbd/0x200
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa6432a19>]
> ip_local_deliver+0x59/0xd0
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa6432670>] ?
> ip_rcv_finish+0x370/0x370
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa6432390>]
> ip_rcv_finish+0x90/0x370
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa6432d49>]
> ip_rcv+0x2b9/0x410
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa6123411>] ?
> blk_complete_request+0x21/0x30
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa63ecab9>]
> __netif_receive_skb_core+0x729/0xa20
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa63ecdc8>]
> __netif_receive_skb+0x18/0x60
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa63ece50>]
> netif_receive_skb_internal+0x40/0xc0
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa63eda78>]
> napi_gro_receive+0xd8/0x100
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffc0983183>]
> mlx5i_handle_rx_cqe+0x2a3/0x460 [mlx5_core]
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffc09826f8>]
> mlx5e_poll_rx_cq+0xc8/0x8b0 [mlx5_core]
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffc0983909>]
> mlx5e_napi_poll+0x99/0x280 [mlx5_core]
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa63ed46f>]
> net_rx_action+0x26f/0x390
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa5e9b085>]
> __do_softirq+0xf5/0x280
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa6523cec>]
> call_softirq+0x1c/0x30
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa5e2d625>]
> do_softirq+0x65/0xa0
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa5e9b405>]
> irq_exit+0x105/0x110
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa6524f86>] do_IRQ+0x56/0xf0
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa6517362>]
> common_interrupt+0x162/0x162
> Aug  7 06:47:30 xxxxxxx kernel: <EOI>  [<ffffffffa5fc12d5>] ?
> do_read_fault.isra.60+0x5/0x1a0
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa5fc5a9c>] ?
> handle_pte_fault+0x2dc/0xc30
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa5fc7c3d>]
> handle_mm_fault+0x39d/0x9b0
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa651b547>]
> __do_page_fault+0x197/0x4f0
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa651b8d5>]
> do_page_fault+0x35/0x90
> Aug  7 06:47:30 xxxxxxx kernel: [<ffffffffa6517758>]
> page_fault+0x28/0x30
> Aug  7 06:47:30 xxxxxxx kernel: ---[ end trace 020d3cfb07217435 ]---
> 
> Then this very soon after
> 
> Aug  7 06:47:48 xxxxxxx kernel: ------------[ cut here ]------------
> Aug  7 06:47:48 xxxxxxx kernel: WARNING: CPU: 10 PID: 89058 at
> lib/list_debug.c:53 __list_del_entry+0x63/0xd0
> Aug  7 06:47:48 xxxxxxx kernel: list_del corruption,
> ffff9f6fba35bb70->next is LIST_POISON1 (dead000000000100)
> 
> Aug  7 06:47:48 xxxxxxx kernel: Modules linked in: bnx2i cnic uio ip_vs
> nf_conntrack ip6table_filter ip6_tables iptable_filter tcp_diag
> udp_diag inet_diag unix_diag af_packet_diag netlink_diag
> oracleadvm(POE) oracleoks(POE) oracleasm(O) nfsv3 rpcsec_gss_krb5
> nfsv4 dns_resolver nfs fscache dm_round_robin bonding rpcrdma
> ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi
> ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib
> rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx5_ib ib_core
> vfat fat xfs sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi
> kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel iTCO_wdt
> aesni_intel ipmi_ssif lrw iTCO_vendor_support gf128mul glue_helper
> ablk_helper ioatdma cryptd ipmi_si pcspkr joydev ipmi_devintf hpwdt
> i2c_i801 hpilo sg lpc_ich wmi dca ipmi_msghandler
> Aug  7 06:47:48 xxxxxxx kernel: acpi_power_meter pcc_cpufreq shpchp
> dm_multipath binfmt_misc nfsd auth_rpcgss nfs_acl lockd grace sunrpc
> ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic
> i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
> fb_sys_fops mlx5_core ttm mlxfw drm bnx2x tg3 devlink mdio
> crct10dif_pclmul libcrc32c crct10dif_common hpsa ptp i2c_core
> crc32c_intel scsi_transport_sas pps_core dm_mirror dm_region_hash
> dm_log dm_mod
> Aug  7 06:47:48 xxxxxxx kernel: CPU: 10 PID: 89058 Comm: tnslsnr
> Tainted: P        W  OE  ------------   3.10.0-862.9.1.el7.x86_64 #1
> Aug  7 06:47:48 xxxxxxx kernel: Hardware name: HP ProLiant DL380
> Gen9/ProLiant DL380 Gen9, BIOS P89 05/21/2018
> Aug  7 06:47:48 xxxxxxx kernel: Call Trace:
> Aug  7 06:47:48 xxxxxxx kernel: [<ffffffffa650e84e>]
> dump_stack+0x19/0x1b
> Aug  7 06:47:48 xxxxxxx kernel: [<ffffffffa5e91e18>]
> __warn+0xd8/0x100
> Aug  7 06:47:48 xxxxxxx kernel: [<ffffffffa5e91e9f>]
> warn_slowpath_fmt+0x5f/0x80
> Aug  7 06:47:48 xxxxxxx kernel: [<ffffffffa6168e23>]
> __list_del_entry+0x63/0xd0
> Aug  7 06:47:48 xxxxxxx kernel: [<ffffffffa6168e9d>]
> list_del+0xd/0x30
> Aug  7 06:47:48 xxxxxxx kernel: [<ffffffffa5ebc226>]
> remove_wait_queue+0x26/0x40
> Aug  7 06:47:48 xxxxxxx kernel: [<ffffffffa6067a5a>]
> ep_unregister_pollwait.isra.6+0x3a/0x60
> Aug  7 06:47:48 xxxxxxx kernel: [<ffffffffa6067aa2>]
> ep_remove+0x22/0xc0
> Aug  7 06:47:48 xxxxxxx kernel: [<ffffffffa6068f1f>]
> SyS_epoll_ctl+0x4bf/0xc60
> Aug  7 06:47:48 xxxxxxx kernel: [<ffffffffa651b56c>] ?
> __do_page_fault+0x1bc/0x4f0
> Aug  7 06:47:48 xxxxxxx kernel: [<ffffffffa6520795>]
> system_call_fastpath+0x1c/0x21
> Aug  7 06:47:48 xxxxxxx kernel: ---[ end trace 020d3cfb07217438 ]---
> 
> 
> These started after 7.5
> 
> messages-20180806:Aug  4 20:10:54 xxxxxxx kernel: WARNING: CPU: 1
> PID: 48632 at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0
> messages-20180806:Aug  4 20:10:54 xxxxxxx kernel: list_del corruption.
> prev->next should be ffff9f6991eb6648, but was           (null)
> messages-20180806:Aug  4 20:10:54 xxxxxxx kernel: [<ffffffffa6168e61>]
> __list_del_entry+0xa1/0xd0
> messages-20180806:Aug  5 00:25:39 xxxxxxx kernel: WARNING: CPU: 3
> PID: 84714 at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0
> messages-20180806:Aug  5 00:25:39 xxxxxxx kernel: list_del corruption.
> prev->next should be ffff9f0bb12206c8, but was           (null)
> messages-20180806:Aug  5 00:25:39 xxxxxxx kernel: [<ffffffffa6168e61>]
> __list_del_entry+0xa1/0xd0
> messages-20180806:Aug  5 00:33:42 xxxxxxx kernel: WARNING: CPU: 4
> PID: 80177 at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0
> messages-20180806:Aug  5 00:33:42 xxxxxxx kernel: list_del corruption.
> prev->next should be ffff9f69546a7ac8, but was dead000000000200
> messages-20180806:Aug  5 00:33:42 xxxxxxx kernel: [<ffffffffa6168e61>]
> __list_del_entry+0xa1/0xd0
> messages-20180806:Aug  5 00:40:14 xxxxxxx kernel: WARNING: CPU: 13
> PID: 80177 at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0
> messages-20180806:Aug  5 00:40:14 xxxxxxx kernel: list_del corruption.
> prev->next should be ffff9f44a5b9f248, but was           (null)
> messages-20180806:Aug  5 00:40:14 xxxxxxx kernel: [<ffffffffa6168e61>]
> __list_del_entry+0xa1/0xd0
> messages-20180806:Aug  5 00:43:16 xxxxxxx kernel: WARNING: CPU: 0
> PID: 51133 at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0
> messages-20180806:Aug  5 00:43:16 xxxxxxx kernel: list_del corruption.
> prev->next should be ffff9f6792776d48, but was           (null)
> messages-20180806:Aug  5 00:43:16 xxxxxxx kernel: [<ffffffffa6168e61>]
> __list_del_entry+0xa1/0xd0
> It will be tough to get upstream tested here, so I am continuing to
> try to reproduce.
> 
> Has anybody seen this list corruption before?

I forgot to include this, which is important.

It's a list corruption, now in the ipoib code.

Aug  6 23:16:30 xxxxxxxxxx kernel: WARNING: CPU: 9 PID: 10865 at
lib/list_debug.c:59 __list_del_entry+0xa1/0xd0
Aug  6 23:16:30 xxxxxxxxxx kernel: list_del corruption. prev->next
should be ffff9f63a3219fc8, but was dead000000000200
Aug  6 23:16:30 xxxxxxxxxx kernel: Modules linked in: bnx2i cnic uio
ip_vs nf_conntrack ip6table_filter ip6_tables iptable_filter tcp_diag
udp_diag inet_diag unix_diag af_packet_diag netlink_diag
oracleadvm(POE) oracleoks(POE) oracleasm(O) nfsv3 rpcsec_gss_krb5 nfsv4
dns_resolver nfs fscache dm_round_robin bonding rpcrdma ib_isert
iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt
target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib rdma_ucm
ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm mlx5_ib ib_core vfat fat
xfs sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm
irqbypass crc32_pclmul ghash_clmulni_intel iTCO_wdt aesni_intel
ipmi_ssif lrw iTCO_vendor_support gf128mul glue_helper ablk_helper
ioatdma cryptd ipmi_si pcspkr joydev ipmi_devintf hpwdt i2c_i801 hpilo
sg lpc_ich wmi dca ipmi_msghandler
Aug  6 23:16:30 xxxxxxxxxx kernel: acpi_power_meter pcc_cpufreq shpchp
dm_multipath binfmt_misc nfsd auth_rpcgss nfs_acl lockd grace sunrpc
ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic
i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt
fb_sys_fops mlx5_core ttm mlxfw drm bnx2x tg3 devlink mdio
crct10dif_pclmul libcrc32c crct10dif_common hpsa ptp i2c_core
crc32c_intel scsi_transport_sas pps_core dm_mirror dm_region_hash
dm_log dm_mod
Aug  6 23:16:30 xxxxxxxxxx kernel: CPU: 9 PID: 10865 Comm: kworker/u48:3
Tainted: P        W  OE  ------------   3.10.0-862.9.1.el7.x86_64 #1
Aug  6 23:16:30 xxxxxxxxxx kernel: Hardware name: HP ProLiant DL380
Gen9/ProLiant DL380 Gen9, BIOS P89 05/21/2018
Aug  6 23:16:30 xxxxxxxxxx kernel: Workqueue: ipoib_wq ipoib_reap_neigh
[ib_ipoib]
Aug  6 23:16:30 xxxxxxxxxx kernel: Call Trace:
Aug  6 23:16:30 xxxxxxxxxx kernel: [<ffffffffa650e84e>]
dump_stack+0x19/0x1b
Aug  6 23:16:30 xxxxxxxxxx kernel: [<ffffffffa5e91e18>]
__warn+0xd8/0x100
Aug  6 23:16:30 xxxxxxxxxx kernel: [<ffffffffa5e91e9f>]
warn_slowpath_fmt+0x5f/0x80
Aug  6 23:16:30 xxxxxxxxxx kernel: [<ffffffffa5e2959e>] ?
__switch_to+0xce/0x580
Aug  6 23:16:30 xxxxxxxxxx kernel: [<ffffffffa6168e61>]
__list_del_entry+0xa1/0xd0
Aug  6 23:16:30 xxxxxxxxxx kernel: [<ffffffffc0ac2224>]
ipoib_reap_neigh+0x174/0x1a0 [ib_ipoib]
Aug  6 23:16:30 xxxxxxxxxx kernel: [<ffffffffa5eb35ef>]
process_one_work+0x17f/0x440
Aug  6 23:16:30 xxxxxxxxxx kernel: [<ffffffffa5eb4686>]
worker_thread+0x126/0x3c0
Aug  6 23:16:30 xxxxxxxxxx kernel: [<ffffffffa5eb4560>] ?
manage_workers.isra.24+0x2a0/0x2a0
Aug  6 23:16:30 xxxxxxxxxx kernel: [<ffffffffa5ebb621>]
kthread+0xd1/0xe0
Aug  6 23:16:30 xxxxxxxxxx kernel: [<ffffffffa5ebb550>] ?
insert_kthread_work+0x40/0x40
Aug  6 23:16:30 xxxxxxxxxx kernel: [<ffffffffa65205f7>]
ret_from_fork_nospec_begin+0x21/0x21
Aug  6 23:16:30 xxxxxxxxxx kernel: [<ffffffffa5ebb550>] ?
insert_kthread_work+0x40/0x40
Aug  6 23:16:30 xxxxxxxxxx kernel: ---[ end trace 020d3cfb07217423 ]---
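
For anyone triaging similar logs, a rough sketch of how the "but was" values
in the list_del corruption warnings above could be tallied, even after mail
wrapping has split the messages across lines. The function name and the
sample text are illustrative only (the sample is copied from the messages in
this mail); this is not a tool used in this thread.

```python
import re
from collections import Counter

# Matches the lib/list_debug.c message even when mail wrapping has split
# it across several lines (\s+ also matches the embedded newlines).
CORRUPTION_RE = re.compile(
    r"list_del\s+corruption\.\s+prev->next\s+should\s+be\s+([0-9a-f]+),"
    r"\s+but\s+was\s+(\(null\)|[0-9a-f]+)"
)


def summarize_list_del(log_text):
    """Count the 'but was' values seen in list_del corruption warnings.

    Hypothetical helper for illustration; it only looks at message text.
    """
    # Strip mail quote markers ("> ") so quoted and unquoted logs parse alike.
    unquoted = re.sub(r"(?m)^(?:>[ \t]?)+", "", log_text)
    return Counter(m.group(2) for m in CORRUPTION_RE.finditer(unquoted))


# Sample lines taken from the warnings quoted in this mail.
sample = """\
> messages-20180806:Aug  5 00:33:42 xxxxxxx kernel: list_del
> corruption.
> prev->next should be ffff9f69546a7ac8, but was dead000000000200
Aug  6 23:16:30 xxxxxxxxxx kernel: list_del corruption. prev->next
should be ffff9f63a3219fc8, but was dead000000000200
> messages-20180806:Aug  5 00:40:14 xxxxxxx kernel: list_del
> corruption.
> prev->next should be ffff9f44a5b9f248, but was           (null)
"""

if __name__ == "__main__":
    for value, count in summarize_list_del(sample).most_common():
        print(f"{value}: {count}")
```

Separating the (null) cases from the dead000000000200 cases may help show
whether all of the warnings share one signature or not.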



