I installed OFED 4.0-2.0.0.1 on a fresh snapshot with the stock kernel
(3.10.0-514.16.1.el7.x86_64). I'm getting a segfault on the server
side, but not on the client side. I don't see any debug packages in
the OFED package to load the symbols.

rping server:
# gdb rping core.10405
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-94.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/bin/rping...Reading symbols from /usr/bin/rping...(no debugging symbols found)...done.
(no debugging symbols found)...done.
[New LWP 10405]
[New LWP 10408]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `rping -s'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f31883d45b4 in ibv_alloc_pd () from /usr/lib64/libibverbs.so.1
Missing separate debuginfos, use: debuginfo-install librdmacm-utils-1.1.0mlnx-OFED.4.0.1.6.1.40200.x86_64
(gdb) bt
#0  0x00007f31883d45b4 in ibv_alloc_pd () from /usr/lib64/libibverbs.so.1
#1  0x0000000000402fe6 in rping_setup_qp.isra.7 ()
#2  0x0000000000401d04 in main ()
(gdb) list
No symbol table is loaded.  Use the "file" command.

rping client:
# rping -c -a 192.168.13.13
cma event RDMA_CM_EVENT_REJECTED, error 28
wait for CONNECTED state 4
connect error -1

----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1

On Tue, May 16, 2017 at 1:23 PM, Robert LeBlanc <robert@xxxxxxxxxxxxx> wrote:
> This is using ConnectX-4 LX RoCE cards, using only in-box drivers.
>
> While trying to debug some iSER issues, I'm trying to run rping between
> the two hosts, but I'm getting a segfault. Sagi suggested that there
> may be something wrong with my kernel ABI. I did a make mrproper,
> built the latest 4.9.28 kernel, and installed the kernel headers.
>
> make -j 32 && sudo make modules_install && sudo make install && sudo make headers_install INSTALL_HDR_PATH=/usr
>
> After booting into the new kernel, I kept getting the segfaults, so I
> rebuilt the libibverbs, libibumad, and librdmacm packages in case they
> weren't picking up the new kernel headers. Still no luck.
>
> Here is the server side of rping with the rebuilt packages:
> # gdb rping core.22936
> GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-94.el7
> Copyright (C) 2013 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-redhat-linux-gnu".
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>...
> Reading symbols from /usr/bin/rping...Reading symbols from /usr/lib/debug/usr/bin/rping.debug...done.
> done.
> [New LWP 22936]
> [New LWP 22939]
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> Core was generated by `rping -s'.
> Program terminated with signal 11, Segmentation fault.
> #0  __ibv_alloc_pd (context=0x0) at src/verbs.c:196
> 196             pd = context->ops.alloc_pd(context);
> (gdb) bt
> #0  __ibv_alloc_pd (context=0x0) at src/verbs.c:196
> #1  0x000055f60331d5f6 in rping_setup_qp (cb=cb@entry=0x55f603d74780, cm_id=<optimized out>) at examples/rping.c:519
> #2  0x000055f60331be7e in rping_run_server (cb=0x55f603d74780) at examples/rping.c:890
> #3  main (argc=2, argv=0x7ffcd16aae88) at examples/rping.c:1268
> (gdb) f 0
> #0  __ibv_alloc_pd (context=0x0) at src/verbs.c:196
> 196             pd = context->ops.alloc_pd(context);
> (gdb) list
> 191
> 192     struct ibv_pd *__ibv_alloc_pd(struct ibv_context *context)
> 193     {
> 194             struct ibv_pd *pd;
> 195
> 196             pd = context->ops.alloc_pd(context);
> 197             if (pd)
> 198                     pd->context = context;
> 199
> 200             return pd;
> (gdb) p context
> $1 = (struct ibv_context *) 0x0
>
> Here is the rping client that does not have the rebuilt packages:
> # gdb rping core.8253
> GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-94.el7
> Copyright (C) 2013 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-redhat-linux-gnu".
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>...
> Reading symbols from /usr/bin/rping...Reading symbols from /usr/lib/debug/usr/bin/rping.debug...done.
> done.
> [New LWP 8253]
> [New LWP 8256]
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> Core was generated by `rping -c -a 192.168.13.13'.
> Program terminated with signal 11, Segmentation fault.
> #0  __ibv_dereg_mr (mr=0x560e295e93b0) at src/verbs.c:299
> 299             ret = mr->context->ops.dereg_mr(mr);
> (gdb) bt
> #0  __ibv_dereg_mr (mr=0x560e295e93b0) at src/verbs.c:299
> #1  0x0000560e293cd917 in rping_free_buffers (cb=0x560e295e5780) at examples/rping.c:470
> #2  0x0000560e293cbf57 in rping_run_client (cb=<optimized out>) at examples/rping.c:1111
> #3  main (argc=<optimized out>, argv=<optimized out>) at examples/rping.c:1270
> (gdb) f 9
> #0  0x0000000000000000 in ?? ()
> (gdb) f 0
> #0  __ibv_dereg_mr (mr=0x560e295e93b0) at src/verbs.c:299
> 299             ret = mr->context->ops.dereg_mr(mr);
> (gdb) list
> 294     {
> 295             int ret;
> 296             void *addr = mr->addr;
> 297             size_t length = mr->length;
> 298
> 299             ret = mr->context->ops.dereg_mr(mr);
> 300             if (!ret)
> 301                     ibv_dofork_range(addr, length);
> 302
> 303             return ret;
> (gdb) p mr
> $1 = (struct ibv_mr *) 0x560e295e93b0
> (gdb) p *mr
> $2 = {context = 0x7fd423be5090, pd = 0x560e295e9960, addr = 0x560e295e57e8, length = 16, handle = 0, lkey = 72829, rkey = 72829}
> (gdb) p *mr->context
> Cannot access memory at address 0x7fd423be5090
>
> Any ideas on what I'm doing wrong?
>
> Thanks,
>
> ----------------
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
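
For reference, the server-side listing quoted above dereferences context without checking it, so a NULL context faults on line 196. Below is a minimal sketch of the same function shape with a guard added. It uses stand-in types so it compiles without libibverbs; every demo_* name is hypothetical, and the choice of ENODEV is my assumption, not what libibverbs or rping actually does.

```c
#include <errno.h>
#include <stddef.h>

/* Stand-ins for struct ibv_context / struct ibv_pd (hypothetical,
 * trimmed to the two fields the gdb listing touches). */
struct demo_context;

struct demo_pd {
	struct demo_context *context;
};

struct demo_context {
	struct demo_pd *(*alloc_pd)(struct demo_context *context);
};

/* Same shape as the quoted __ibv_alloc_pd listing, but a NULL context
 * is reported as an error instead of being dereferenced. */
struct demo_pd *demo_alloc_pd_checked(struct demo_context *context)
{
	struct demo_pd *pd;

	if (context == NULL) {
		errno = ENODEV;	/* assumption: "no device context" */
		return NULL;
	}

	pd = context->alloc_pd(context);
	if (pd)
		pd->context = context;

	return pd;
}

/* Toy provider callback so the success path can be exercised. */
struct demo_pd demo_static_pd;

struct demo_pd *demo_provider_alloc_pd(struct demo_context *context)
{
	(void)context;
	return &demo_static_pd;
}
```

With a guard like this, a caller in the position of rping_setup_qp would get NULL back and could print an error rather than dumping core. It does not explain why the context was NULL in the first place.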