rping segfault with 4.9.28 on CentOS 7.3

The setup is ConnectX-4 Lx RoCE cards with only the in-box drivers.

While debugging some iSER issues, I tried running rping between the
two hosts, but it segfaults on both sides. Sagi suggested that there
may be something wrong with my kernel ABI, so I ran make mrproper,
built the latest 4.9.28 kernel, and installed the kernel headers:

make -j 32 && sudo make modules_install && sudo make install && \
    sudo make headers_install INSTALL_HDR_PATH=/usr

After booting into the new kernel I still got the segfaults, so I
rebuilt the libibverbs, libibumad, and librdmacm packages in case they
were not picking up the new kernel headers. Still no luck.

Here is the rping server, which has the rebuilt packages:
# gdb rping core.22936
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-94.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/bin/rping...Reading symbols from
/usr/lib/debug/usr/bin/rping.debug...done.
done.
[New LWP 22936]
[New LWP 22939]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `rping -s'.
Program terminated with signal 11, Segmentation fault.
#0  __ibv_alloc_pd (context=0x0) at src/verbs.c:196
196             pd = context->ops.alloc_pd(context);
(gdb) bt
#0  __ibv_alloc_pd (context=0x0) at src/verbs.c:196
#1  0x000055f60331d5f6 in rping_setup_qp (cb=cb@entry=0x55f603d74780,
cm_id=<optimized out>) at examples/rping.c:519
#2  0x000055f60331be7e in rping_run_server (cb=0x55f603d74780) at
examples/rping.c:890
#3  main (argc=2, argv=0x7ffcd16aae88) at examples/rping.c:1268
(gdb) f 0
#0  __ibv_alloc_pd (context=0x0) at src/verbs.c:196
196             pd = context->ops.alloc_pd(context);
(gdb) list
191
192     struct ibv_pd *__ibv_alloc_pd(struct ibv_context *context)
193     {
194             struct ibv_pd *pd;
195
196             pd = context->ops.alloc_pd(context);
197             if (pd)
198                     pd->context = context;
199
200             return pd;
(gdb) p context
$1 = (struct ibv_context *) 0x0
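
The backtrace above shows __ibv_alloc_pd dereferencing a NULL context
handed down from rping_setup_qp, presumably the device context carried
by the cm_id. As a minimal sketch of the kind of guard that would turn
this crash into an error message, the following uses stand-in types
only: sketch_context, sketch_pd, sketch_alloc_pd_checked, and
sketch_provider_alloc_pd are hypothetical names for illustration, not
the real libibverbs structures or API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in types for illustration only; NOT the real libibverbs layout. */
struct sketch_pd;

struct sketch_context {
    /* Provider callback, analogous in spirit to context->ops.alloc_pd. */
    struct sketch_pd *(*alloc_pd)(struct sketch_context *ctx);
};

struct sketch_pd {
    struct sketch_context *context;
};

/*
 * Same shape as the __ibv_alloc_pd listing, plus a NULL check on the
 * context so the caller gets NULL back instead of a segfault.
 */
struct sketch_pd *sketch_alloc_pd_checked(struct sketch_context *ctx)
{
    struct sketch_pd *pd;

    if (ctx == NULL) {
        fprintf(stderr, "alloc_pd: no device context available\n");
        return NULL;
    }

    /* A context without a provider callback is a programming error. */
    assert(ctx->alloc_pd != NULL);

    pd = ctx->alloc_pd(ctx);
    if (pd)
        pd->context = ctx;

    return pd;
}

/* Hypothetical provider callback that hands out one static PD. */
static struct sketch_pd sketch_static_pd;

struct sketch_pd *sketch_provider_alloc_pd(struct sketch_context *ctx)
{
    (void)ctx;
    return &sketch_static_pd;
}
```

A guard like this only hides the symptom. The real question is why
the context is NULL when rping_setup_qp runs in the first place.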

Here is the rping client, which does not have the rebuilt packages:
# gdb rping core.8253
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-94.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/bin/rping...Reading symbols from
/usr/lib/debug/usr/bin/rping.debug...done.
done.
[New LWP 8253]
[New LWP 8256]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `rping -c -a 192.168.13.13'.
Program terminated with signal 11, Segmentation fault.
#0  __ibv_dereg_mr (mr=0x560e295e93b0) at src/verbs.c:299
299             ret = mr->context->ops.dereg_mr(mr);
(gdb) bt
#0  __ibv_dereg_mr (mr=0x560e295e93b0) at src/verbs.c:299
#1  0x0000560e293cd917 in rping_free_buffers (cb=0x560e295e5780) at
examples/rping.c:470
#2  0x0000560e293cbf57 in rping_run_client (cb=<optimized out>) at
examples/rping.c:1111
#3  main (argc=<optimized out>, argv=<optimized out>) at examples/rping.c:1270
(gdb) f 9
#0  0x0000000000000000 in ?? ()
(gdb) f 0
#0  __ibv_dereg_mr (mr=0x560e295e93b0) at src/verbs.c:299
299             ret = mr->context->ops.dereg_mr(mr);
(gdb) list
294     {
295             int ret;
296             void *addr      = mr->addr;
297             size_t length   = mr->length;
298
299             ret = mr->context->ops.dereg_mr(mr);
300             if (!ret)
301                     ibv_dofork_range(addr, length);
302
303             return ret;
(gdb) p mr
$1 = (struct ibv_mr *) 0x560e295e93b0
(gdb) p *mr
$2 = {context = 0x7fd423be5090, pd = 0x560e295e9960, addr =
0x560e295e57e8, length = 16, handle = 0, lkey = 72829, rkey = 72829}
(gdb) p *mr->context
Cannot access memory at address 0x7fd423be5090

Any ideas on what I'm doing wrong?

Thanks,

----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1