RE: NULL ptr dereference in rpcrdma_regbuf_is_mapped


> From: Chuck Lever [mailto:chuck.lever@xxxxxxxxxx]
> Sent: Tuesday, January 30, 2018 7:28 PM
> To: Kalderon, Michal <Michal.Kalderon@xxxxxxxxxx>
> Cc: linux-rdma@xxxxxxxxxxxxxxx
> Subject: Re: NULL ptr dereference in rpcrdma_regbuf_is_mapped
> 
> 
> 
> > On Jan 30, 2018, at 11:53 AM, Kalderon, Michal
> <Michal.Kalderon@xxxxxxxxxx> wrote:
> >
> >> From: Chuck Lever [mailto:chuck.lever@xxxxxxxxxx]
> >> Sent: Tuesday, January 30, 2018 6:47 PM
> >> To: Kalderon, Michal <Michal.Kalderon@xxxxxxxxxx>
> >> Cc: linux-rdma@xxxxxxxxxxxxxxx
> >> Subject: Re: NULL ptr dereference in rpcrdma_regbuf_is_mapped
> >>
> >>
> >>
> >>> On Jan 30, 2018, at 11:43 AM, Kalderon, Michal
> >> <Michal.Kalderon@xxxxxxxxxx> wrote:
> >>>
> >>> Hi Chuck,
> >>>
> >>> Different issue, so started different thread.
> >>> If I unload our driver while there is an open NFS connection, I get a
> >>> NULL pointer dereference in rpcrdma_regbuf_is_mapped: the pointer to
> >>> buf received in this function is NULL.
> >>
> >> Hi Michal, let's see the backtrace.
> > Sure
> 
> OK, I wonder if this is really the same problem as you reported before. Is this
> rb coming from a possibly corrupted sendctx?
> 
> So, I have a fix for the earlier bug, and I'm testing it. I'll post it later today or
> tomorrow, and let's see if this one goes away too when you try out that fix.

We're still seeing this issue with the new fix applied.
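
For anyone reading the quoted oops below: the faulting bytes `<48> 8b 47 10` decode to `mov 0x10(%rdi),%rax`, RDI (the first argument) is 0, and CR2 is 0x10. That is what a load of a member at offset 0x10 through a NULL pointer looks like. The sketch below is only an illustration in user-space C. The struct names and layout are hypothetical stand-ins (an assumption, not copied from xprt_rdma.h), chosen so that the device pointer follows a 16-byte scatter/gather element.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a scatter/gather element (assumed layout). */
struct mock_sge {
    uint64_t addr;
    uint32_t length;
    uint32_t lkey;
};

/* Hypothetical stand-in for the registered buffer; layout is an assumption. */
struct mock_regbuf {
    struct mock_sge rg_iov;      /* bytes 0x00..0x0f */
    void           *rg_device;   /* offset 0x10 under this layout */
    int             rg_direction;
};

/* Compile-time check that the assumed layout puts rg_device at 0x10. */
static_assert(offsetof(struct mock_regbuf, rg_device) == 0x10,
              "rg_device expected at offset 0x10 in this mock layout");

/* Address that a load of rb->rg_device touches for a given base pointer. */
static uintptr_t fault_address(uintptr_t base)
{
    return base + offsetof(struct mock_regbuf, rg_device);
}
```

With a base of 0 (the RDI value in the trace), `fault_address(0)` evaluates to 0x10, which matches CR2. Under that assumption, the crash in rpcrdma_dma_unmap_regbuf+0xa would be the inlined `rb->rg_device` read with rb == NULL.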

> 
> 
> > [root@GAD17990 ~]# [  169.085616] ib_srpt srpt_remove_one(qedr0): nothing to do.
> > [  169.112490] rpcrdma: removing device qedr0 for 192.168.110.146:20049
> > [  169.143909] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010
> > [  169.181837] IP: rpcrdma_dma_unmap_regbuf+0xa/0x60 [rpcrdma]
> > [  169.209157] PGD 0 P4D 0
> > [  169.221720] Oops: 0000 [#1] SMP
> > [  169.237123] Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm 8021q garp mrp qedr(-) ib_core xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc fuse ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter dm_mirror dm_region_hash dm_log dm_mod dax vfat fat intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel sg iTCO_wdt hpilo crypto_simd iTCO_vendor_support
> > [  169.590977]  i2c_i801 glue_helper hpwdt cryptd ioatdma pcspkr lpc_ich ipmi_si shpchp i2c_core wmi dca pcc_cpufreq acpi_power_meter ipmi_devintf mfd_core ipmi_msghandler nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c qede sd_mod qed tg3 ptp crc32c_intel hpsa pps_core scsi_transport_sas
> > [  169.725580] CPU: 30 PID: 2798 Comm: kworker/30:1H Not tainted 4.14.0-rc8+ #1
> > [  169.759488] Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 02/17/2017
> > [  169.800084] Workqueue: xprtiod xprt_autoclose [sunrpc]
> > [  169.824940] task: ffff9f4b6966c380 task.stack: ffffbae6041c4000
> > [  169.854591] RIP: 0010:rpcrdma_dma_unmap_regbuf+0xa/0x60 [rpcrdma]
> > [  169.884029] RSP: 0018:ffffbae6041c7dd0 EFLAGS: 00010287
> > [  169.910009] RAX: ffff9f476aa18220 RBX: ffff9f476aa18000 RCX: 0000000000000001
> > [  169.945042] RDX: 0000000000000184 RSI: 000000046c1ba100 RDI: 0000000000000000
> > [  169.980932] RBP: ffffbae6041c7dd8 R08: 0000000000000000 R09: 000000018020001e
> > [  170.016169] R10: 000000006b2c6601 R11: ffff9f476b2c6400 R12: ffff9f4768295550
> > [  170.051560] R13: ffff9f4768295878 R14: ffff9f47682958e0 R15: ffff9f47682953d0
> > [  170.086745] FS:  0000000000000000(0000) GS:ffff9f4b7f980000(0000) knlGS:0000000000000000
> > [  170.126491] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  170.154565] CR2: 0000000000000010 CR3: 00000005cec09005 CR4: 00000000001606e0
> > [  170.189497] Call Trace:
> > [  170.201856]  rpcrdma_ia_remove+0xca/0x110 [rpcrdma]
> > [  170.225375]  xprt_rdma_close+0x70/0x90 [rpcrdma]
> > [  170.248093]  xprt_autoclose+0x38/0x70 [sunrpc]
> > [  170.269801]  process_one_work+0x149/0x360
> > [  170.290135]  worker_thread+0x4d/0x3e0
> > [  170.308004]  kthread+0x109/0x140
> > [  170.323668]  ? rescuer_thread+0x380/0x380
> > [  170.343204]  ? kthread_park+0x60/0x60
> > [  170.360671]  ret_from_fork+0x25/0x30
> > [  170.378242] Code: 48 c7 c6 c0 e4 89 c0 48 c7 c7 70 fa 89 c0 31 c0 e8 9f 36 85 d6 e9 e5 fe ff ff 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 53 <48> 8b 47 10 48 89 fb 48 85 c0 74 38 8b 4f 18 8b 57 08 48 8b 37
> > [  170.469580] RIP: rpcrdma_dma_unmap_regbuf+0xa/0x60 [rpcrdma] RSP: ffffbae6041c7dd0
> > [  170.505477] CR2: 0000000000000010
> > [  170.522257] ---[ end trace 0f69dc0bd121b690 ]---
> > [  170.546716] Kernel panic - not syncing: Fatal exception
> > [  170.572952] Kernel Offset: 0x16000000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> > [  170.628837] ---[ end Kernel panic - not syncing: Fatal exception [
> > 170.657647] ------------[ cut here ]------------ [  170.679659]
> > WARNING: CPU: 30 PID: 2798 at kernel/sched/core.c:1179
> > set_task_cpu+0x191/0x1a0 [  170.719367] Modules linked in: nfsv3
> > rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache rpcrdma ib_isert
> iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod
> ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad
> rdma_cm ib_cm iw_cm 8021q garp mrp qedr(-) ib_core xt_CHECKSUM
> iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat
> nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack
> nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc fuse ebtable_filter
> ebtables ip6table_filter ip6_tables iptable_filter dm_mirror dm_region_hash
> dm_log dm_mod dax vfat fat intel_rapl x86_pkg_temp_thermal
> intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul
> crc32_pclmul ghash_clmulni_intel pcbc aesni_intel sg iTCO_wdt hpilo
> crypto_simd iTCO_vendor_support [  171.072372]  i2c_i801 glue_helper
> hpwdt cryptd ioatdma pcspkr lpc_ich ipmi_si shpchp i2c_core wmi dca
> pcc_cpufreq acpi_power_meter ipmi_devintf mfd_core ipmi_msghandler
> nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c qede
> sd_mod qed tg3 ptp crc32c_intel hpsa pps_core scsi_transport_sas
> > [  171.204786] CPU: 30 PID: 2798 Comm: kworker/30:1H Tainted: G      D
> 4.14.0-rc8+ #1
> > [  171.245766] Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380
> > Gen9, BIOS P89 02/17/2017 [  171.286989] Workqueue: xprtiod
> > xprt_autoclose [sunrpc] [  171.311991] task: ffff9f4b6966c380
> > task.stack: ffffbae6041c4000 [  171.340762] RIP:
> > 0010:set_task_cpu+0x191/0x1a0 [  171.362691] RSP:
> > 0018:ffff9f4b7f983c38 EFLAGS: 00010046 [  171.388519] RAX:
> > 0000000000000200 RBX: ffff9f4b66652d00 RCX: 0000000000000008 [
> > 171.423127] RDX: 0000000000000001 RSI: 0000000000000008 RDI:
> > ffff9f4b66652d00 [  171.457854] RBP: ffff9f4b7f983c58 R08:
> > 00000000ff00ff00 R09: 0000000000000000 [  171.493635] R10:
> > 0000000000000005 R11: 0000000000000c6c R12: ffff9f4b666537ec [
> > 171.528287] R13: 0000000000000008 R14: 0000000000000008 R15:
> > 000000000001bb80 [  171.562311] FS:  0000000000000000(0000)
> > GS:ffff9f4b7f980000(0000) knlGS:0000000000000000 [  171.602138] CS:  0010
> DS: 0000 ES: 0000 CR0: 0000000080050033 [  171.630792] CR2:
> 0000000000000010 CR3: 00000005cec09005 CR4: 00000000001606e0 [
> 171.666624] Call Trace:
> > [  171.678579]  <IRQ>
> > [  171.688847]  try_to_wake_up+0x15d/0x440 [  171.707708]
> > default_wake_function+0x12/0x20 [  171.728711]
> > __wake_up_common+0x8a/0x160 [  171.747913]
> __wake_up_locked+0x16/0x20
> > [  171.766944]  ep_poll_callback+0xd0/0x300 [  171.786061]  ?
> > find_next_bit+0xb/0x10 [  171.804356]  __wake_up_common+0x8a/0x160
> [
> > 171.823705]  __wake_up_common_lock+0x7e/0xc0 [  171.844238]
> > __wake_up+0x13/0x20 [  171.860037]
> wake_up_klogd_work_func+0x40/0x60
> > [  171.881812]  irq_work_run_list+0x4d/0x70 [  171.900977]  ?
> > tick_sched_do_timer+0x70/0x70 [  171.921688]  irq_work_tick+0x40/0x50
> > [  171.939208]  update_process_times+0x42/0x60 [  171.959871]
> > tick_sched_handle+0x2d/0x60 [  171.979125]  tick_sched_timer+0x39/0x70
> > [  171.998012]  __hrtimer_run_queues+0xe5/0x230 [  172.019188]
> > hrtimer_interrupt+0xa8/0x1a0 [  172.038449]
> > smp_apic_timer_interrupt+0x5f/0x130
> > [  172.061660]  apic_timer_interrupt+0x9d/0xb0 [  172.082865]  </IRQ>
> > [  172.092655] RIP: 0010:panic+0x1fd/0x245 [  172.111274] RSP:
> > 0018:ffffbae6041c7b10 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10 [
> > 172.148505] RAX: 0000000000000034 RBX: 0000000000000000 RCX:
> > 0000000000000006 [  172.183145] RDX: 0000000000000000 RSI:
> > 0000000000000092 RDI: ffff9f4b7f98e030 [  172.218508] RBP:
> > ffffbae6041c7b80 R08: 00000000fffffffe R09: 0000000000000c6d [
> > 172.253436] R10: 0000000000000005 R11: 0000000000000c6c R12:
> > ffffffff97a304c0 [  172.288829] R13: 0000000000000000 R14:
> > 0000000000000000 R15: 0000000000000046 [  172.323876]
> > oops_end+0xb8/0xd0 [  172.339074]  no_context+0x1a8/0x400 [
> > 172.355962]  __bad_area_nosemaphore+0xee/0x1d0 [  172.377507]
> > bad_area_nosemaphore+0x14/0x20 [  172.397830]
> > __do_page_fault+0x9a/0x4f0 [  172.416720]  ? __slab_free+0x9b/0x2c0 [
> > 172.434427]  do_page_fault+0x38/0x130 [  172.452423]
> > page_fault+0x22/0x30 [  172.468930] RIP:
> > 0010:rpcrdma_dma_unmap_regbuf+0xa/0x60 [rpcrdma] [  172.499075]
> RSP:
> > 0018:ffffbae6041c7dd0 EFLAGS: 00010287 [  172.524585] RAX:
> > ffff9f476aa18220 RBX: ffff9f476aa18000 RCX: 0000000000000001 [
> > 172.559900] RDX: 0000000000000184 RSI: 000000046c1ba100 RDI:
> > 0000000000000000 [  172.594323] RBP: ffffbae6041c7dd8 R08:
> > 0000000000000000 R09: 000000018020001e [  172.628972] R10:
> > 000000006b2c6601 R11: ffff9f476b2c6400 R12: ffff9f4768295550 [
> > 172.664614] R13: ffff9f4768295878 R14: ffff9f47682958e0 R15:
> > ffff9f47682953d0 [  172.699424]  rpcrdma_ia_remove+0xca/0x110
> > [rpcrdma] [  172.723165]  xprt_rdma_close+0x70/0x90 [rpcrdma] [
> > 172.745492]  xprt_autoclose+0x38/0x70 [sunrpc] [  172.766956]
> > process_one_work+0x149/0x360 [  172.786682]
> worker_thread+0x4d/0x3e0
> > [  172.804538]  kthread+0x109/0x140 [  172.820117]  ?
> > rescuer_thread+0x380/0x380 [  172.839845]  ? kthread_park+0x60/0x60 [
> > 172.857255]  ret_from_fork+0x25/0x30 [  172.874616] Code: ff 80 8b ec
> > 07 00 00 04 e9 23 ff ff ff 0f ff e9 bf fe ff ff f7 83 84 00 00 00 fd
> > ff ff ff 0f 84 c9 fe ff ff 0f ff e9 c2 fe ff ff <0f> ff e9 d1 fe ff ff
> > 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 [  172.964520] ---[ end trace
> 0f69dc0bd121b691 ]--- [  172.986715] sched: Unexpected reschedule of
> offline CPU#8!
> > [  173.013434] ------------[ cut here ]------------ [  173.036387]
> > WARNING: CPU: 30 PID: 2798 at arch/x86/kernel/smp.c:128
> > native_smp_send_reschedule+0x3c/0x40
> > [  173.084056] Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4
> > dns_resolver nfs fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi
> scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp
> ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm 8021q
> garp mrp qedr(-) ib_core xt_CHECKSUM iptable_mangle ipt_MASQUERADE
> nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4
> nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun
> bridge stp llc fuse ebtable_filter ebtables ip6table_filter ip6_tables
> iptable_filter dm_mirror dm_region_hash dm_log dm_mod dax vfat fat
> intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel
> kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc
> aesni_intel sg iTCO_wdt hpilo crypto_simd iTCO_vendor_support [
> 173.436980]  i2c_i801 glue_helper hpwdt cryptd ioatdma pcspkr lpc_ich
> ipmi_si shpchp i2c_core wmi dca pcc_cpufreq acpi_power_meter
> ipmi_devintf mfd_core ipmi_msghandler nfsd auth_rpcgss nfs_acl lockd
> grace sunrpc ip_tables xfs libcrc32c qede sd_mod qed tg3 ptp crc32c_intel
> hpsa pps_core scsi_transport_sas
> > [  173.571732] CPU: 30 PID: 2798 Comm: kworker/30:1H Tainted: G      D W
> 4.14.0-rc8+ #1
> > [  173.612509] Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380
> > Gen9, BIOS P89 02/17/2017 [  173.652963] Workqueue: xprtiod
> > xprt_autoclose [sunrpc] [  173.676765] task: ffff9f4b6966c380
> > task.stack: ffffbae6041c4000 [  173.705121] RIP:
> > 0010:native_smp_send_reschedule+0x3c/0x40
> > [  173.732678] RSP: 0018:ffff9f4b7f983bc0 EFLAGS: 00010046 [
> > 173.758761] RAX: 000000000000002e RBX: 0000000000000008 RCX:
> > 0000000000000006 [  173.793600] RDX: 0000000000000000 RSI:
> > 0000000000000096 RDI: ffff9f4b7f98e030 [  173.828260] RBP:
> > ffff9f4b7f983bc0 R08: 00000000fffffffe R09: 0000000000000cb8 [
> > 173.863983] R10: 0000000000000005 R11: 0000000000000cb7 R12:
> > ffff9f4b7f61bb80 [  173.898845] R13: ffff9f4b66652d00 R14:
> > ffff9f4b7f983c78 R15: ffff9f4b7f61bb80 [  173.933362] FS:
> > 0000000000000000(0000) GS:ffff9f4b7f980000(0000)
> > knlGS:0000000000000000 [  173.973149] CS:  0010 DS: 0000 ES: 0000 CR0:
> 0000000080050033 [  174.001483] CR2: 0000000000000010 CR3:
> 00000005cec09005 CR4: 00000000001606e0 [  174.036495] Call Trace:
> > [  174.048417]  <IRQ>
> > [  174.058474]  resched_curr+0xa1/0xc0 [  174.075648]
> > check_preempt_curr+0x79/0x90 [  174.095247]
> ttwu_do_wakeup+0x1e/0x160
> > [  174.113663]  ttwu_do_activate+0x7a/0x90 [  174.132487]
> > try_to_wake_up+0x1d4/0x440 [  174.151326]
> > default_wake_function+0x12/0x20 [  174.172536]
> > __wake_up_common+0x8a/0x160 [  174.191867]
> __wake_up_locked+0x16/0x20
> > [  174.210537]  ep_poll_callback+0xd0/0x300 [  174.230062]  ?
> > find_next_bit+0xb/0x10 [  174.248307]  __wake_up_common+0x8a/0x160
> [
> > 174.267633]  __wake_up_common_lock+0x7e/0xc0 [  174.288376]
> > __wake_up+0x13/0x20 [  174.305441]
> wake_up_klogd_work_func+0x40/0x60
> > [  174.327468]  irq_work_run_list+0x4d/0x70 [  174.346526]  ?
> > tick_sched_do_timer+0x70/0x70 [  174.366836]  irq_work_tick+0x40/0x50
> > [  174.384200]  update_process_times+0x42/0x60 [  174.404774]
> > tick_sched_handle+0x2d/0x60 [  174.424463]  tick_sched_timer+0x39/0x70
> > [  174.443012]  __hrtimer_run_queues+0xe5/0x230 [  174.464378]
> > hrtimer_interrupt+0xa8/0x1a0 [  174.484072]
> > smp_apic_timer_interrupt+0x5f/0x130
> > [  174.506656]  apic_timer_interrupt+0x9d/0xb0 [  174.527140]  </IRQ>
> > [  174.537197] RIP: 0010:panic+0x1fd/0x245 [  174.556508] RSP:
> > 0018:ffffbae6041c7b10 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10 [
> > 174.594707] RAX: 0000000000000034 RBX: 0000000000000000 RCX:
> > 0000000000000006 [  174.629573] RDX: 0000000000000000 RSI:
> > 0000000000000092 RDI: ffff9f4b7f98e030 [  174.664900] RBP:
> > ffffbae6041c7b80 R08: 00000000fffffffe R09: 0000000000000c6d [
> > 174.699889] R10: 0000000000000005 R11: 0000000000000c6c R12:
> > ffffffff97a304c0 [  174.734379] R13: 0000000000000000 R14:
> > 0000000000000000 R15: 0000000000000046 [  174.768961]
> > oops_end+0xb8/0xd0 [  174.784712]  no_context+0x1a8/0x400 [
> > 174.802290]  __bad_area_nosemaphore+0xee/0x1d0 [  174.823836]
> > bad_area_nosemaphore+0x14/0x20 [  174.844449]
> > __do_page_fault+0x9a/0x4f0 [  174.863790]  ? __slab_free+0x9b/0x2c0 [
> > 174.881658]  do_page_fault+0x38/0x130 [  174.899896]
> > page_fault+0x22/0x30 [  174.916071] RIP:
> > 0010:rpcrdma_dma_unmap_regbuf+0xa/0x60 [rpcrdma] [  174.945525]
> RSP:
> > 0018:ffffbae6041c7dd0 EFLAGS: 00010287 [  174.971363] RAX:
> > ffff9f476aa18220 RBX: ffff9f476aa18000 RCX: 0000000000000001 [
> > 175.006604] RDX: 0000000000000184 RSI: 000000046c1ba100 RDI:
> > 0000000000000000 [  175.041286] RBP: ffffbae6041c7dd8 R08:
> > 0000000000000000 R09: 000000018020001e [  175.075703] R10:
> > 000000006b2c6601 R11: ffff9f476b2c6400 R12: ffff9f4768295550 [
> > 175.109678] R13: ffff9f4768295878 R14: ffff9f47682958e0 R15:
> > ffff9f47682953d0 [  175.143959]  rpcrdma_ia_remove+0xca/0x110
> > [rpcrdma] [  175.167172]  xprt_rdma_close+0x70/0x90 [rpcrdma] [
> > 175.188870]  xprt_autoclose+0x38/0x70 [sunrpc] [  175.209659]
> > process_one_work+0x149/0x360 [  175.229048]
> worker_thread+0x4d/0x3e0
> > [  175.246899]  kthread+0x109/0x140 [  175.262894]  ?
> > rescuer_thread+0x380/0x380 [  175.282497]  ? kthread_park+0x60/0x60 [
> > 175.301315]  ret_from_fork+0x25/0x30 [  175.318548] Code: db 00 0f 92
> > c0 84 c0 74 14 48 8b 05 cf bf a9 00 be fd 00 00 00 ff 90 a0 00 00 00
> > 5d c3 89 fe 48 c7 c7 d0 74 a3 97 e8 57 7f 09 00 <0f> ff 5d c3 0f 1f 44
> > 00 00 55 48 89 e5 48 83 ec 20 65 48 8b 04 [  175.411188] ---[ end
> > trace 0f69dc0bd121b692 ]--- [  175.434569] unchecked MSR access error:
> WRMSR to 0x83f (tried to write 0x00000000000000f6) at rIP:
> 0xffffffff97064044 (native_write_msr+0x4/0x30) [  175.497555] Call Trace:
> > [  175.509756]  <IRQ>
> > [  175.519920]  ? native_apic_msr_write+0x30/0x40 [  175.541602]
> > x2apic_send_IPI_self+0x1d/0x20 [  175.562384]
> > arch_irq_work_raise+0x28/0x40 [  175.582084]
> irq_work_queue+0x6e/0x80
> > [  175.600412]  dbs_update_util_handler+0x8a/0xb0 [  175.621994]
> > task_tick_fair+0x6cb/0x7f0 [  175.640991]  scheduler_tick+0x62/0xe0 [
> > 175.659042]  ? tick_sched_do_timer+0x70/0x70 [  175.679307]
> > update_process_times+0x47/0x60 [  175.699836]
> > tick_sched_handle+0x2d/0x60 [  175.718917]  tick_sched_timer+0x39/0x70
> > [  175.737191]  __hrtimer_run_queues+0xe5/0x230 [  175.757635]
> > hrtimer_interrupt+0xa8/0x1a0 [  175.777022]
> > smp_apic_timer_interrupt+0x5f/0x130
> > [  175.799455]  apic_timer_interrupt+0x9d/0xb0 [  175.819696]  </IRQ>
> > [  175.830023] RIP: 0010:panic+0x1fd/0x245 [  175.849324] RSP:
> > 0018:ffffbae6041c7b10 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10 [
> > 175.886349] RAX: 0000000000000034 RBX: 0000000000000000 RCX:
> > 0000000000000006 [  175.921984] RDX: 0000000000000000 RSI:
> > 0000000000000092 RDI: ffff9f4b7f98e030 [  175.957360] RBP:
> > ffffbae6041c7b80 R08: 00000000fffffffe R09: 0000000000000c6d [
> > 175.992424] R10: 0000000000000005 R11: 0000000000000c6c R12:
> > ffffffff97a304c0 [  176.027292] R13: 0000000000000000 R14:
> > 0000000000000000 R15: 0000000000000046 [  176.062516]
> > oops_end+0xb8/0xd0 [  176.078137]  no_context+0x1a8/0x400 [
> > 176.095725]  __bad_area_nosemaphore+0xee/0x1d0 [  176.117534]
> > bad_area_nosemaphore+0x14/0x20 [  176.137970]
> > __do_page_fault+0x9a/0x4f0 [  176.156564]  ? __slab_free+0x9b/0x2c0 [
> > 176.174433]  do_page_fault+0x38/0x130 [  176.192638]
> > page_fault+0x22/0x30 [  176.209165] RIP:
> > 0010:rpcrdma_dma_unmap_regbuf+0xa/0x60 [rpcrdma] [  176.239509]
> RSP:
> > 0018:ffffbae6041c7dd0 EFLAGS: 00010287 [  176.266224] RAX:
> > ffff9f476aa18220 RBX: ffff9f476aa18000 RCX: 0000000000000001 [
> > 176.302139] RDX: 0000000000000184 RSI: 000000046c1ba100 RDI:
> > 0000000000000000 [  176.339413] RBP: ffffbae6041c7dd8 R08:
> > 0000000000000000 R09: 000000018020001e [  176.375049] R10:
> > 000000006b2c6601 R11: ffff9f476b2c6400 R12: ffff9f4768295550 [
> > 176.410088] R13: ffff9f4768295878 R14: ffff9f47682958e0 R15:
> > ffff9f47682953d0 [  176.446853]  rpcrdma_ia_remove+0xca/0x110
> > [rpcrdma] [  176.471509]  xprt_rdma_close+0x70/0x90 [rpcrdma] [
> > 176.494036]  xprt_autoclose+0x38/0x70 [sunrpc] [  176.515339]
> > process_one_work+0x149/0x360 [  176.534945]
> worker_thread+0x4d/0x3e0
> > [  176.553011]  kthread+0x109/0x140 [  176.568810]  ?
> > rescuer_thread+0x380/0x380 [  176.588222]  ? kthread_park+0x60/0x60 [
> > 176.606412]  ret_from_fork+0x25/0x30 [  176.624241] sched: Unexpected
> > reschedule of offline CPU#0!
> > [  176.651713] ------------[ cut here ]------------ [  176.675084]
> > WARNING: CPU: 30 PID: 2798 at arch/x86/kernel/smp.c:128
> > native_smp_send_reschedule+0x3c/0x40
> > [  176.721703] Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4
> > dns_resolver nfs fscache rpcrdma ib_isert iscsi_target_mod ib_iser libiscsi
> scsi_transport_iscsi ib_srpt target_core_mod ib_srp scsi_transport_srp
> ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm 8021q
> garp mrp qedr(-) ib_core xt_CHECKSUM iptable_mangle ipt_MASQUERADE
> nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4
> nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun
> bridge stp llc fuse ebtable_filter ebtables ip6table_filter ip6_tables
> iptable_filter dm_mirror dm_region_hash dm_log dm_mod dax vfat fat
> intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel
> kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc
> aesni_intel sg iTCO_wdt hpilo crypto_simd iTCO_vendor_support [
> 177.076380]  i2c_i801 glue_helper hpwdt cryptd ioatdma pcspkr lpc_ich
> ipmi_si shpchp i2c_core wmi dca pcc_cpufreq acpi_power_meter
> ipmi_devintf mfd_core ipmi_msghandler nfsd auth_rpcgss nfs_acl lockd
> grace sunrpc ip_tables xfs libcrc32c qede sd_mod qed tg3 ptp crc32c_intel
> hpsa pps_core scsi_transport_sas
> > [  177.209069] CPU: 30 PID: 2798 Comm: kworker/30:1H Tainted: G      D W
> 4.14.0-rc8+ #1
> > [  177.250754] Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380
> > Gen9, BIOS P89 02/17/2017 [  177.290959] Workqueue: xprtiod
> > xprt_autoclose [sunrpc] [  177.315723] task: ffff9f4b6966c380
> > task.stack: ffffbae6041c4000 [  177.345409] RIP:
> > 0010:native_smp_send_reschedule+0x3c/0x40
> > [  177.372066] RSP: 0018:ffff9f4b7f983e60 EFLAGS: 00010046 [
> > 177.396735] RAX: 000000000000002e RBX: 0000000000000000 RCX:
> > 0000000000000000 [  177.430540] RDX: 0000000000000000 RSI:
> > ffff9f4b7f98e038 RDI: ffff9f4b7f98e038 [  177.464537] RBP:
> > ffff9f4b7f983e60 R08: 00000000fffffffe R09: 0000000000000d39 [
> > 177.499327] R10: 0000000000000005 R11: 0000000000000d38 R12:
> > 000000000000001e [  177.534782] R13: 00000000fffe07b2 R14:
> > ffff9f4b6966c380 R15: ffff9f4b7f994768 [  177.569257] FS:
> > 0000000000000000(0000) GS:ffff9f4b7f980000(0000)
> > knlGS:0000000000000000 [  177.609059] CS:  0010 DS: 0000 ES: 0000 CR0:
> 0000000080050033 [  177.637131] CR2: 0000000000000010 CR3:
> 00000005cec09005 CR4: 00000000001606e0 [  177.672422] Call Trace:
> > [  177.684361]  <IRQ>
> > [  177.693996]  trigger_load_balance+0x105/0x1f0 [  177.715180]
> > scheduler_tick+0xab/0xe0 [  177.733058]  ?
> > tick_sched_do_timer+0x70/0x70 [  177.754008]
> > update_process_times+0x47/0x60 [  177.774941]
> > tick_sched_handle+0x2d/0x60 [  177.793687]  tick_sched_timer+0x39/0x70
> > [  177.812071]  __hrtimer_run_queues+0xe5/0x230 [  177.832929]
> > hrtimer_interrupt+0xa8/0x1a0 [  177.852686]
> > smp_apic_timer_interrupt+0x5f/0x130
> > [  177.875204]  apic_timer_interrupt+0x9d/0xb0 [  177.895378]  </IRQ>
> > [  177.905441] RIP: 0010:panic+0x1fd/0x245 [  177.923668] RSP:
> > 0018:ffffbae6041c7b10 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10 [
> > 177.960074] RAX: 0000000000000034 RBX: 0000000000000000 RCX:
> > 0000000000000006 [  177.995687] RDX: 0000000000000000 RSI:
> > 0000000000000092 RDI: ffff9f4b7f98e030 [  178.031336] RBP:
> > ffffbae6041c7b80 R08: 00000000fffffffe R09: 0000000000000c6d [
> > 178.066983] R10: 0000000000000005 R11: 0000000000000c6c R12:
> > ffffffff97a304c0 [  178.102127] R13: 0000000000000000 R14:
> > 0000000000000000 R15: 0000000000000046 [  178.136976]
> > oops_end+0xb8/0xd0 [  178.152223]  no_context+0x1a8/0x400 [
> > 178.169104]  __bad_area_nosemaphore+0xee/0x1d0 [  178.191361]
> > bad_area_nosemaphore+0x14/0x20 [  178.211956]
> > __do_page_fault+0x9a/0x4f0 [  178.231161]  ? __slab_free+0x9b/0x2c0 [
> > 178.248982]  do_page_fault+0x38/0x130 [  178.267162]
> > page_fault+0x22/0x30 [  178.283360] RIP:
> > 0010:rpcrdma_dma_unmap_regbuf+0xa/0x60 [rpcrdma] [  178.313151]
> RSP:
> > 0018:ffffbae6041c7dd0 EFLAGS: 00010287 [  178.339653] RAX:
> > ffff9f476aa18220 RBX: ffff9f476aa18000 RCX: 0000000000000001 [
> > 178.373929] RDX: 0000000000000184 RSI: 000000046c1ba100 RDI:
> > 0000000000000000 [  178.408422] RBP: ffffbae6041c7dd8 R08:
> > 0000000000000000 R09: 000000018020001e [  178.443865] R10:
> > 000000006b2c6601 R11: ffff9f476b2c6400 R12: ffff9f4768295550 [
> > 178.478532] R13: ffff9f4768295878 R14: ffff9f47682958e0 R15:
> > ffff9f47682953d0 [  178.514716]  rpcrdma_ia_remove+0xca/0x110
> > [rpcrdma] [  178.539078]  xprt_rdma_close+0x70/0x90 [rpcrdma] [
> > 178.561619]  xprt_autoclose+0x38/0x70 [sunrpc] [  178.583680]
> > process_one_work+0x149/0x360 [  178.603772]
> worker_thread+0x4d/0x3e0
> > [  178.621660]  kthread+0x109/0x140 [  178.637765]  ?
> > rescuer_thread+0x380/0x380 [  178.657000]  ? kthread_park+0x60/0x60 [
> > 178.674977]  ret_from_fork+0x25/0x30 [  178.692351] Code: db 00 0f 92
> > c0 84 c0 74 14 48 8b 05 cf bf a9 00 be fd 00 00 00 ff 90 a0 00 00 00
> > 5d c3 89 fe 48 c7 c7 d0 74 a3 97 e8 57 7f 09 00 <0f> ff 5d c3 0f 1f 44
> > 00 00 55 48 89 e5 48 83 ec 20 65 48 8b 04 [  178.784645] ---[ end
> > trace 0f69dc0bd121b693 ]---
> >
> >>
> >>> If I check buf for NULL and return false, I am able to unload the
> >>> driver, though I'm not sure this is sufficient.
> >>>
> >>> diff --git a/net/sunrpc/xprtrdma/xprt_rdma.h b/net/sunrpc/xprtrdma/xprt_rdma.h
> >>> index 1342f743..73066a6 100644
> >>> --- a/net/sunrpc/xprtrdma/xprt_rdma.h
> >>> +++ b/net/sunrpc/xprtrdma/xprt_rdma.h
> >>> @@ -588,7 +588,7 @@ struct rpcrdma_regbuf *rpcrdma_alloc_regbuf(size_t, enum dma_data_direction,
> >>>  static inline bool
> >>>  rpcrdma_regbuf_is_mapped(struct rpcrdma_regbuf *rb)
> >>>  {
> >>> -       return rb->rg_device != NULL;
> >>> +       return rb && (rb->rg_device != NULL);
> >>> :
> >>>
> >>>
> >>> It would be great if you could take a look.
> >>> Thanks,
> >>> Michal
> >>> --
> >>> To unsubscribe from this list: send the line "unsubscribe linux-rdma"
> >>> in the body of a message to majordomo@xxxxxxxxxxxxxxx More
> >> majordomo
> >>> info at  http://vger.kernel.org/majordomo-info.html
> >>
> >> --
> >> Chuck Lever
> >>
> >>
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-rdma"
> > in the body of a message to majordomo@xxxxxxxxxxxxxxx More
> majordomo
> > info at  http://vger.kernel.org/majordomo-info.html
> 
> --
> Chuck Lever
> 
> 
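
To make the effect of the guard in the quoted diff concrete, here is a minimal user-space sketch. Everything prefixed with `mock_` is hypothetical and only stands in for the kernel structures; the unmap helper assumes (not confirmed in this thread) that the caller bails out early when the buffer is not mapped. The sketch shows why the NULL check lets the unload proceed, but it says nothing about why rb is NULL in the first place, so it may only hide the real problem.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for the registered buffer. */
struct mock_regbuf {
    void *rg_device;   /* non-NULL while the buffer is DMA-mapped */
};

/* Guarded predicate: a NULL rb is reported as "not mapped".
 * && short-circuits, so rb->rg_device is never read when rb is NULL. */
static inline bool mock_regbuf_is_mapped(const struct mock_regbuf *rb)
{
    return rb && rb->rg_device != NULL;
}

/* Assumed caller shape: skip buffers that are not mapped.
 * Returns true only when an unmap actually took place. */
static inline bool mock_unmap_regbuf(struct mock_regbuf *rb)
{
    if (!mock_regbuf_is_mapped(rb))
        return false;          /* nothing to do, including rb == NULL */
    rb->rg_device = NULL;      /* stands in for the real DMA unmap */
    return true;
}
```

Calling `mock_unmap_regbuf(NULL)` returns false instead of faulting, which matches the observation that the driver can be unloaded with the check in place.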




