Re: rping segfault with 4.9.28 on CentOS 7.3

I installed OFED 4.0-2.0.0.1 on a fresh snapshot with the stock kernel
(3.10.0-514.16.1.el7.x86_64). rping now segfaults on the server side
only; the client no longer crashes, but its connection is rejected. I
don't see any debuginfo packages in the OFED bundle, so I can't load
symbols for the backtrace below.

rping server:

# gdb rping core.10405
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-94.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/bin/rping...Reading symbols from
/usr/bin/rping...(no debugging symbols found)...done.
(no debugging symbols found)...done.
[New LWP 10405]
[New LWP 10408]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `rping -s'.
Program terminated with signal 11, Segmentation fault.
#0  0x00007f31883d45b4 in ibv_alloc_pd () from /usr/lib64/libibverbs.so.1
Missing separate debuginfos, use: debuginfo-install
librdmacm-utils-1.1.0mlnx-OFED.4.0.1.6.1.40200.x86_64
(gdb) bt
#0  0x00007f31883d45b4 in ibv_alloc_pd () from /usr/lib64/libibverbs.so.1
#1  0x0000000000402fe6 in rping_setup_qp.isra.7 ()
#2  0x0000000000401d04 in main ()
(gdb) list
No symbol table is loaded.  Use the "file" command.
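
For reference, the earlier trace with symbols (quoted below) shows
__ibv_alloc_pd() being entered with context=0x0 and then dereferencing
it at src/verbs.c:196, and this symbol-less trace stops in the same
function. Below is a minimal sketch of that failure pattern and of a
NULL guard in front of it. The mock_context and mock_pd types and the
checked_alloc_pd() helper are hypothetical stand-ins, not the real
libibverbs structures or API, and the choice of ENODEV is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for struct ibv_context and struct ibv_pd.
 * These are NOT the real libibverbs definitions. */
struct mock_context;

struct mock_pd {
    struct mock_context *context;
};

struct mock_context {
    struct mock_pd *(*alloc_pd)(struct mock_context *ctx);
};

/* Same shape as the code listed at verbs.c:196, but it refuses a NULL
 * context instead of dereferencing it. ENODEV is an assumed error code. */
static struct mock_pd *checked_alloc_pd(struct mock_context *ctx)
{
    struct mock_pd *pd;

    if (ctx == NULL) {
        errno = ENODEV;
        return NULL;
    }

    pd = ctx->alloc_pd(ctx);
    if (pd)
        pd->context = ctx;

    return pd;
}

/* Trivial provider callback used only by the demo below. */
static struct mock_pd demo_pd;

static struct mock_pd *demo_alloc(struct mock_context *ctx)
{
    (void)ctx;
    return &demo_pd;
}

/* Exercises both paths; returns 0 if they behave as described. */
static int run_demo(void)
{
    struct mock_context ctx = { .alloc_pd = demo_alloc };
    struct mock_pd *pd;

    errno = 0;
    pd = checked_alloc_pd(NULL);      /* the context=0x0 case */
    assert(pd == NULL && errno == ENODEV);

    pd = checked_alloc_pd(&ctx);      /* a valid context */
    assert(pd == &demo_pd && pd->context == &ctx);

    printf("NULL context rejected, valid context accepted\n");
    return 0;
}
```

A guard like this would only turn the crash into an error return. It
would not explain why the connection id carries no device context in
the first place, which is the actual question here.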

rping client:

# rping -c -a 192.168.13.13
cma event RDMA_CM_EVENT_REJECTED, error 28
wait for CONNECTED state 4
connect error -1
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Tue, May 16, 2017 at 1:23 PM, Robert LeBlanc <robert@xxxxxxxxxxxxx> wrote:
> This is using ConnectX-4 LX RoCE cards, using only in-box drivers.
>
> While trying to debug some iSER issues, I'm trying to do rping between
> the two hosts, but I'm getting a segfault. Sagi suggested that there
> may be something wrong with my kernel ABI. I did a make mrproper and
> built the latest 4.9.28 kernel and installed the kernel headers.
>
> make -j 32 && sudo make modules_install && sudo make install && sudo
> make headers_install INSTALL_HDR_PATH=/usr
>
> After booting into the new kernel, I kept getting the segfaults, so I
> rebuilt the libibverbs, libibumad, and librdmacm packages in case they
> weren't picking up the new kernel headers. Still no luck.
>
> Here is the rping server side, with the rebuilt packages:
> # gdb rping core.22936
> GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-94.el7
> Copyright (C) 2013 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-redhat-linux-gnu".
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>...
> Reading symbols from /usr/bin/rping...Reading symbols from
> /usr/lib/debug/usr/bin/rping.debug...done.
> done.
> [New LWP 22936]
> [New LWP 22939]
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> Core was generated by `rping -s'.
> Program terminated with signal 11, Segmentation fault.
> #0  __ibv_alloc_pd (context=0x0) at src/verbs.c:196
> 196             pd = context->ops.alloc_pd(context);
> (gdb) bt
> #0  __ibv_alloc_pd (context=0x0) at src/verbs.c:196
> #1  0x000055f60331d5f6 in rping_setup_qp (cb=cb@entry=0x55f603d74780,
> cm_id=<optimized out>) at examples/rping.c:519
> #2  0x000055f60331be7e in rping_run_server (cb=0x55f603d74780) at
> examples/rping.c:890
> #3  main (argc=2, argv=0x7ffcd16aae88) at examples/rping.c:1268
> (gdb) f 0
> #0  __ibv_alloc_pd (context=0x0) at src/verbs.c:196
> 196             pd = context->ops.alloc_pd(context);
> (gdb) list
> 191
> 192     struct ibv_pd *__ibv_alloc_pd(struct ibv_context *context)
> 193     {
> 194             struct ibv_pd *pd;
> 195
> 196             pd = context->ops.alloc_pd(context);
> 197             if (pd)
> 198                     pd->context = context;
> 199
> 200             return pd;
> (gdb) p context
> $1 = (struct ibv_context *) 0x0
>
> Here is the rping client that does not have the rebuilt packages:
> # gdb rping core.8253
> GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-94.el7
> Copyright (C) 2013 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-redhat-linux-gnu".
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>...
> Reading symbols from /usr/bin/rping...Reading symbols from
> /usr/lib/debug/usr/bin/rping.debug...done.
> done.
> [New LWP 8253]
> [New LWP 8256]
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> Core was generated by `rping -c -a 192.168.13.13'.
> Program terminated with signal 11, Segmentation fault.
> #0  __ibv_dereg_mr (mr=0x560e295e93b0) at src/verbs.c:299
> 299             ret = mr->context->ops.dereg_mr(mr);
> (gdb) bt
> #0  __ibv_dereg_mr (mr=0x560e295e93b0) at src/verbs.c:299
> #1  0x0000560e293cd917 in rping_free_buffers (cb=0x560e295e5780) at
> examples/rping.c:470
> #2  0x0000560e293cbf57 in rping_run_client (cb=<optimized out>) at
> examples/rping.c:1111
> #3  main (argc=<optimized out>, argv=<optimized out>) at examples/rping.c:1270
> (gdb) f 9
> #0  0x0000000000000000 in ?? ()
> (gdb) f 0
> #0  __ibv_dereg_mr (mr=0x560e295e93b0) at src/verbs.c:299
> 299             ret = mr->context->ops.dereg_mr(mr);
> (gdb) list
> 294     {
> 295             int ret;
> 296             void *addr      = mr->addr;
> 297             size_t length   = mr->length;
> 298
> 299             ret = mr->context->ops.dereg_mr(mr);
> 300             if (!ret)
> 301                     ibv_dofork_range(addr, length);
> 302
> 303             return ret;
> (gdb) p mr
> $1 = (struct ibv_mr *) 0x560e295e93b0
> (gdb) p *mr
> $2 = {context = 0x7fd423be5090, pd = 0x560e295e9960, addr =
> 0x560e295e57e8, length = 16, handle = 0, lkey = 72829, rkey = 72829}
> (gdb) p *mr->context
> Cannot access memory at address 0x7fd423be5090
>
> Any ideas on what I'm doing wrong?
>
> Thanks,
>
> ----------------
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


