On Thu, Oct 19, 2017 at 05:40:59PM -0400, Michael J. Ruhl wrote:
> I was playing with the ibacm service and discovered an issue
> the other day.
>
> If no provider library is present (I removed libacmp.so, and the
> provider keyword in the opts.cfg file is libacmp), when a resolve
> request is posted, the kernel will crash with the following Oops:
>
> BUG: unable to handle kernel NULL pointer dereference at (null)
> IP: (null)
> PGD 10543f1067 P4D 10543f1067 PUD 1033f93067 PMD 0
> Oops: 0010 [#1] SMP
> Modules linked in: rpcrdma ib_isert iscsi_target_mod
> target_core_mod ib_iser libiscsi scsi_transport_iscsi ib_ipoib rdma_ucm ib_u
> ib_uverbs ib_umad rdma_cm ib_cm iw_cm dm_mirror dm_region_hash dm_log dm_mod
> dax sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass
> crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_si
> glue_helper cryptd hfi1 rdmavt iTCO_wdt iTCO_vendor_support ib_core mei_me
> lpc_ich pcspkr mei ioatdma sg shpchp i2c_i801 mfd_core wmi ipmi_si ipmi_devi
> ipmi_msghandler acpi_power_meter acpi_pad nfsd auth_rpcgss nfs_acl lockd gra
> sunrpc ip_tables ext4 mbcache jbd2 sd_mod mgag200 drm_kms_helper syscopyarea
> sysfillrect sysimgblt fb_sys_fops ttm igb ahci crc32c_intel ptp libahci
> pps_core drm dca libata i2c_algo_bit i2c_core
> CPU: 54 PID: 9841 Comm: ibacm Tainted: G I 4.14.0-rc2+ #6
> Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
> task: ffff880855f42d00 task.stack: ffffc900246b4000
> RIP: 0010: (null)
> RSP: 0018:ffffc900246b7bc8 EFLAGS: 00010246
> RAX: ffffffff81dbe9e0 RBX: ffff881058bb1000 RCX: 0000000000000000
> RDX: 0000000000001100 RSI: ffff881058bb1320 RDI: ffff881056362000
> RBP: ffffc900246b7bf8 R08: 0000000000000ec0 R09: 0000000000001100
> R10: ffff8810573a5000 R11: 0000000000000000 R12: ffff881056362000
> R13: 0000000000000ec0 R14: ffff881058bb1320 R15: 0000000000000ec0
> FS: 00007fe0ba5a38c0(0000) GS:ffff88105f080000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000000 CR3: 0000001056f5d003 CR4: 00000000001606e0
> Call Trace:
> ? netlink_dump+0x12c/0x290
> __netlink_dump_start+0x186/0x1f0
> rdma_nl_rcv_msg+0x193/0x1b0 [ib_core]
> rdma_nl_rcv+0xdc/0x130 [ib_core]
> netlink_unicast+0x181/0x240
> netlink_sendmsg+0x2c2/0x3b0
> sock_sendmsg+0x38/0x50
> SYSC_sendto+0x102/0x190
> ? __audit_syscall_entry+0xaf/0x100
> ? syscall_trace_enter+0x1d0/0x2b0
> ? __audit_syscall_exit+0x209/0x290
> SyS_sendto+0xe/0x10
> do_syscall_64+0x67/0x1b0
> entry_SYSCALL64_slow_path+0x25/0x25
> RIP: 0033:0x7fe0b9db2a63
> RSP: 002b:00007ffc55edc260 EFLAGS: 00000293 ORIG_RAX: 000000000000002c
> RAX: ffffffffffffffda RBX: 0000000000000010 RCX: 00007fe0b9db2a63
> RDX: 0000000000000010 RSI: 00007ffc55edc280 RDI: 000000000000000d
> RBP: 00007ffc55edc670 R08: 00007ffc55edc270 R09: 000000000000000c
> R10: 0000000000000000 R11: 0000000000000293 R12: 00007ffc55edc280
> R13: 000000000260b400 R14: 000000000000000d R15: 0000000000000001
> Code: Bad RIP value.
> RIP: (null) RSP: ffffc900246b7bc8
> CR2: 0000000000000000
> ---[ end trace 8d67abcfd10ec209 ]---
> Kernel panic - not syncing: Fatal exception
> Kernel Offset: disabled
> ---[ end Kernel panic - not syncing: Fatal exception
> ------------[ cut here ]------------
>
> The issue is that in rdma_nl_rcv_msg(), the check
> 'if (flags & NLM_F_DUMP)' is not completely correct.
>
> NLM_F_DUMP is two bits NLM_F_ROOT | NLM_F_MATCH.
>
> ibacm sends a RDMA_NL_LS response with the RDMA_NL_LS_F_ERR bit set
> if an error occurs in the service (like no provider being available,
> or ACM_STATUS_ENODATA, etc.).
>
> NLM_F_ROOT == (0x100) == RDMA_NL_LS_F_ERR.
>
> The current code thinks that it sees a NLM_F_DUMP flag and incorrectly calls
> the .dump() callback.
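
The flag collision described in the quoted report can be reproduced in a few lines of userspace C. This is only a sketch: the EX_-prefixed constants are local copies of the values stated in the report (they are not taken from kernel headers), and the helper names are hypothetical. The all-bits comparison is shown purely to illustrate the difference between the two tests; it is not necessarily what the submitted patch does.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the values quoted in the report (assumed, EX_ prefix). */
#define EX_NLM_F_ROOT        0x100
#define EX_NLM_F_MATCH       0x200
#define EX_NLM_F_DUMP        (EX_NLM_F_ROOT | EX_NLM_F_MATCH)
#define EX_RDMA_NL_LS_F_ERR  0x100

/* The LS error bit aliases one of the two dump bits. */
static_assert(EX_RDMA_NL_LS_F_ERR == EX_NLM_F_ROOT,
              "LS error bit has the same value as NLM_F_ROOT");

/* Any-bit test, the shape of 'if (flags & NLM_F_DUMP)':
 * true when ROOT or MATCH (or both) is set. */
static bool ex_is_dump_any_bit(uint16_t flags)
{
    return (flags & EX_NLM_F_DUMP) != 0;
}

/* All-bits test: true only when ROOT and MATCH are both set. */
static bool ex_is_dump_all_bits(uint16_t flags)
{
    return (flags & EX_NLM_F_DUMP) == EX_NLM_F_DUMP;
}
```

With flags equal to EX_RDMA_NL_LS_F_ERR alone, the any-bit test reports a dump request while the all-bits test does not, which is the misinterpretation the report describes.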

Hi Michael,

Thanks for the report and for the excellent analysis.

You are right that RDMA_NL_LS_F_ERR has the same value as NLM_F_ROOT, and
that is bad, but I don't think it is the real root cause. On error, the LS
was supposed to send an NLMSG_ERROR message rather than overload the general
nlmsg_flags, which is awful. However, I don't know whether it is feasible to
fix the current implementation without breaking the UAPI contract.

In the meantime, can we implement dummy dumpit functions for the LS that
reuse ib_nl_is_good_ip_resp? I prefer this solution over yours because it
doesn't mix LS specifics into the general decision function and keeps the LS
anomalies in the LS-relevant code.

Also, returning 0 when there is no dumpit function, in response to a message
carrying the NLM_F_DUMP flag, is wrong. The user should be made aware that
something was wrong with the request.

Thanks

>
> The included patch is an attempt to fix this issue. This patch fixes the
> issue that I am seeing, but I am not sure how to test the messages for
> RDMA_NL_RDMA_CM or RDMA_NL_IWCM (or any message that uses the
> NLM_F_DUMP bits).
>
> If anyone has some knowledge of these services, any extra testing would
> be welcomed.
>
> If the patch has no issues or comments, I will formally re-submit it
> (through my usual channel Denny).
>
> Thanks,
>
> Mike
>
>
> ---
>
> Michael J. Ruhl (1):
>       RDMA/netlink: OOPs in rdma_nl_rcv_msg() from misinterpreted flag
>
>
>  drivers/infiniband/core/netlink.c |    7 +++++--
>  1 files changed, 5 insertions(+), 2 deletions(-)
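
The point in the reply about a missing dumpit function can be illustrated with a small userspace sketch. Everything here is hypothetical: struct ex_handler, ex_dispatch_dump() and ex_dummy_dump() are invented for illustration and do not mirror the kernel's callback table or ib_nl_is_good_ip_resp. The sketch only shows a dispatcher that reports an error when no dump callback is registered, rather than jumping through a NULL pointer or returning 0.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-operation handler entry (not the kernel structure). */
struct ex_handler {
    int (*dump)(const void *msg);
};

/* Hypothetical stand-in for a registered dump callback. */
static int ex_dummy_dump(const void *msg)
{
    (void)msg;      /* a real callback would validate and consume msg */
    return 0;
}

/* Dispatch a dump-flagged message.
 * A missing callback yields -EINVAL so the sender learns that the
 * request was not handled; the NULL pointer is never called. */
static int ex_dispatch_dump(const struct ex_handler *h, const void *msg)
{
    if (h == NULL || h->dump == NULL)
        return -EINVAL;
    return h->dump(msg);
}
```

A table entry with no dump callback then fails with -EINVAL, while an entry that registers ex_dummy_dump is dispatched normally.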