This is with ConnectX-4 LX RoCE cards, using only in-box drivers. While debugging some iSER issues, I'm trying to run rping between the two hosts, but I'm getting a segfault. Sagi suggested that there may be something wrong with my kernel ABI. I did a make mrproper, built the latest 4.9.28 kernel, and installed the kernel headers:

make -j 32 && sudo make modules_install && sudo make install && sudo make headers_install INSTALL_HDR_PATH=/usr

After booting into the new kernel, I kept getting the segfaults, so I rebuilt the libibverbs, libibumad, and librdmacm packages in case they weren't picking up the new kernel headers. Still no luck.

Here is the rping server, with the rebuilt packages:

# gdb rping core.22936
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-94.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/bin/rping...Reading symbols from /usr/lib/debug/usr/bin/rping.debug...done.
done.
[New LWP 22936]
[New LWP 22939]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `rping -s'.
Program terminated with signal 11, Segmentation fault.
#0  __ibv_alloc_pd (context=0x0) at src/verbs.c:196
196             pd = context->ops.alloc_pd(context);
(gdb) bt
#0  __ibv_alloc_pd (context=0x0) at src/verbs.c:196
#1  0x000055f60331d5f6 in rping_setup_qp (cb=cb@entry=0x55f603d74780, cm_id=<optimized out>) at examples/rping.c:519
#2  0x000055f60331be7e in rping_run_server (cb=0x55f603d74780) at examples/rping.c:890
#3  main (argc=2, argv=0x7ffcd16aae88) at examples/rping.c:1268
(gdb) f 0
#0  __ibv_alloc_pd (context=0x0) at src/verbs.c:196
196             pd = context->ops.alloc_pd(context);
(gdb) list
191
192     struct ibv_pd *__ibv_alloc_pd(struct ibv_context *context)
193     {
194             struct ibv_pd *pd;
195
196             pd = context->ops.alloc_pd(context);
197             if (pd)
198                     pd->context = context;
199
200             return pd;
(gdb) p context
$1 = (struct ibv_context *) 0x0

Here is the rping client that does not have the rebuilt packages:

# gdb rping core.8253
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-94.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/bin/rping...Reading symbols from /usr/lib/debug/usr/bin/rping.debug...done.
done.
[New LWP 8253]
[New LWP 8256]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `rping -c -a 192.168.13.13'.
Program terminated with signal 11, Segmentation fault.
#0  __ibv_dereg_mr (mr=0x560e295e93b0) at src/verbs.c:299
299             ret = mr->context->ops.dereg_mr(mr);
(gdb) bt
#0  __ibv_dereg_mr (mr=0x560e295e93b0) at src/verbs.c:299
#1  0x0000560e293cd917 in rping_free_buffers (cb=0x560e295e5780) at examples/rping.c:470
#2  0x0000560e293cbf57 in rping_run_client (cb=<optimized out>) at examples/rping.c:1111
#3  main (argc=<optimized out>, argv=<optimized out>) at examples/rping.c:1270
(gdb) f 9
#0  0x0000000000000000 in ?? ()
(gdb) f 0
#0  __ibv_dereg_mr (mr=0x560e295e93b0) at src/verbs.c:299
299             ret = mr->context->ops.dereg_mr(mr);
(gdb) list
294     {
295             int ret;
296             void *addr = mr->addr;
297             size_t length = mr->length;
298
299             ret = mr->context->ops.dereg_mr(mr);
300             if (!ret)
301                     ibv_dofork_range(addr, length);
302
303             return ret;
(gdb) p mr
$1 = (struct ibv_mr *) 0x560e295e93b0
(gdb) p *mr
$2 = {context = 0x7fd423be5090, pd = 0x560e295e9960, addr = 0x560e295e57e8, length = 16, handle = 0, lkey = 72829, rkey = 72829}
(gdb) p *mr->context
Cannot access memory at address 0x7fd423be5090

Any ideas on what I'm doing wrong?

Thanks,
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
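
P.S. To make the server-side crash easier to discuss, here is a minimal sketch of the pattern in the first backtrace: __ibv_alloc_pd is handed context=0x0 and immediately dereferences it. The types and names below (sketch_context, sketch_pd, sketch_alloc_pd_checked, sketch_provider_alloc_pd) are hypothetical stand-ins written for this illustration and are not part of libibverbs. The sketch only shows how a NULL check at the call site would turn the segfault into an error return; it does not explain why the context is NULL in the first place.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in types for illustration only; NOT the real libibverbs structs. */
struct sketch_context;

struct sketch_pd {
	struct sketch_context *context;
};

struct sketch_context {
	/* Plays the role of context->ops.alloc_pd in the gdb listing. */
	struct sketch_pd *(*alloc_pd)(struct sketch_context *ctx);
};

/*
 * Same shape as the __ibv_alloc_pd listing, with one addition: a NULL
 * context (or a missing alloc_pd callback) is reported through errno
 * instead of being dereferenced. ENODEV is an arbitrary choice here.
 */
struct sketch_pd *sketch_alloc_pd_checked(struct sketch_context *ctx)
{
	struct sketch_pd *pd;

	if (ctx == NULL || ctx->alloc_pd == NULL) {
		errno = ENODEV;
		return NULL;
	}

	pd = ctx->alloc_pd(ctx);
	if (pd)
		pd->context = ctx;

	return pd;
}

/* Hypothetical provider callback: hands out a single static PD. */
struct sketch_pd *sketch_provider_alloc_pd(struct sketch_context *ctx)
{
	static struct sketch_pd the_pd;

	assert(ctx != NULL); /* only reachable through the checked wrapper */
	return &the_pd;
}
```

Calling sketch_alloc_pd_checked(NULL) returns NULL with errno set to ENODEV, while calling it with a context whose alloc_pd points at sketch_provider_alloc_pd returns a PD whose context field points back at that context.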