Which protocol/NIC is used in your case? IB/RoCEv1/RoCEv2/iWARP?
If you have any ideas about how to debug the problem, please let me know.
I can try them on my side.

--Thanks
Changcheng

On 13:51 Tue 16 Apr, Roman Penyaev wrote:
> On 2019-04-16 12:41, Liu, Changcheng wrote:
> > Hi Roman,
> >    This problem doesn't happen on the master branch when using a Mellanox
> >    MCX414A-BCAT ConnectX-4 NIC with msg/async/rdma (ms_type=async+rdma).
> >    The problem is hit on an Intel X722 NIC (msg/async/rdma/iwarp).
> >
> >    When the problem happened, ibv_query_devices had already been executed
> >    successfully several times. The problem only happens when
> >    trying to run the ceph-osd daemon.
> >
> >    On my side, I used "strace -f" to trace the command below, which
> >    triggers the segmentation fault:
> >       strace -f ${PATH_CEPH_BUILD_DIR}/bin/ceph-osd -i 2 -c ${PATH_CEPH_BUILD_DIR}/ceph.conf
> >    It doesn't show that setuid is called.
> >
> >    Could you tell me how you use ftrace to track the userspace call
> >    stack to find that setuid is called before opening the rdma devices?
>
> That's correct, I used ftrace the same way. I also used gdb to get
> the exact backtraces; try that as well.
>
> In my case I debugged ceph-mgr or ceph-mds; ceph-osd runs successfully
> even when ms_type=async+rdma is set.
>
> --
> Roman