On Fri, Apr 12, 2019 at 6:42 PM Liu, Changcheng
<changcheng.liu@xxxxxxxxx> wrote:
>
> Hi all,
>    I'm enabling Ceph/RDMA (iWARP) in Ceph v14.2.0.
>    It always hits a segmentation fault when querying RDMA devices, after
>    querying them successfully several times.
>
>    I traced the live kernel and found the problem in ib_uverbs_write:
>    1. ib_safe_file_access(filp) is false, so ib_uverbs_write returns -EACCES.
>    2. filp->f_cred == current_cred() is false, so ib_safe_file_access returns false.
>
>    Could anyone suggest how to check further why filp->f_cred is not equal
>    to current_cred()?

Hi Haodong, do you happen to know why Changcheng is hitting a segfault here?

>
> Below is the kernel code and traced log.
> file: drivers/infiniband/core/uverbs_main.c
> 712 static ssize_t ib_uverbs_write(struct file *filp, const char __user *buf,
> 713                                size_t count, loff_t *pos)
> 714 {
> 715 +----  9 lines: struct ib_uverbs_file *file = filp->private_data;-----
> 724         if (!ib_safe_file_access(filp)) {
> 725                 pr_err_once("uverbs_write: process %d (%s) changed security contexts after opening file descriptor, this is not allowed.\n",
> 726                             task_tgid_vnr(current), current->comm);
> 727                 return -EACCES;
> 728         }
> 729 +--- 74 lines: if (count < sizeof(hdr))-------------------------------
> 803 }
>
> file: kernel/include/rdma/ib.h
> 91 /*
> 92  * The IB interfaces that use write() as bi-directional ioctl() are
> 93  * fundamentally unsafe, since there are lots of ways to trigger "write()"
> 94  * calls from various contexts with elevated privileges. That includes the
> 95  * traditional suid executable error message writes, but also various kernel
> 96  * interfaces that can write to file descriptors.
> 97  *
> 98  * This function provides protection for the legacy API by restricting the
> 99  * calling context.
> 100  */
> 101 static inline bool ib_safe_file_access(struct file *filp)
> 102 {
> 103         return filp->f_cred == current_cred() && !uaccess_kernel();
> 104 }
>
> Kernel trace log:
> root@nstcloudcc1:/sys/kernel/debug/tracing# cat /sys/kernel/debug/tracing/trace
> # tracer: nop
> #
> #                              _-----=> irqs-off
> #                             / _----=> need-resched
> #                            | / _---=> hardirq/softirq
> #                            || / _--=> preempt-depth
> #                            ||| /     delay
> #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
> #              | |       |   ||||       |         |
>            <...>-87018 [003] .... 15409.847504: rdma_verb_fs: (ib_uverbs_write+0x3c/0x3d0 [ib_uverbs]) filp_f_cred=0xffff8906bd855b00 current_cred=0xffff8906ad773500
> get_fs=0xffffffffffffffff
>            <...>-87018 [003] d... 15409.847510: rdma_ib_verb: (__vfs_write+0x1b/0x40 <- ib_uverbs_write) ret=0xfffffffffffffff3 t_name="msgr-worker-0"
>
> B.R.
> Changcheng

--
Regards
Kefu Chai
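
P.S. For anyone reading the trace above: the return probe prints ret as an
unsigned 64-bit hex value, so 0xfffffffffffffff3 has to be reinterpreted as a
signed number to see which error ib_uverbs_write returned. Below is a minimal
user-space sketch of that arithmetic. The helper names decode_kret and
kret_strerror are hypothetical (they are not part of the kernel or of Ceph),
and the sketch assumes a Linux system where EACCES is 13.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: reinterpret the raw 64-bit value printed by the
 * return probe as a signed integer. The unsigned-to-signed conversion is
 * implementation-defined in C, but behaves as two's complement on the
 * compilers commonly used on Linux.
 */
static long long decode_kret(uint64_t raw)
{
    return (long long)(int64_t)raw;
}

/*
 * Hypothetical helper: return the error text for a negative kernel return
 * value, or NULL if the value is not in the kernel's error range
 * [-4095, -1].
 */
static const char *kret_strerror(long long ret)
{
    if (ret >= 0 || ret < -4095)
        return NULL;
    return strerror((int)-ret);
}
```

Calling decode_kret(0xfffffffffffffff3ULL) gives -13, which is -EACCES, the
same value returned on line 727 of the quoted ib_uverbs_write when
ib_safe_file_access(filp) fails. That matches the two different pointers
logged for filp_f_cred and current_cred in the first trace line.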