Re: iSER with policy based routing error

We are getting errors with rping too, even on the same IPv4 subnet.
All userland RDMA programs seem to be failing, as if a library were
missing. We installed the "Infiniband Support" package group, so I'm
not sure what could be missing.

# rping -s -a 0.0.0.0
Segmentation fault

From dmesg:
[Mon May 15 14:28:36 2017] rping[3289]: segfault at 18 ip
00007f0142ca9a34 sp 00007ffd7f0a8cc0 error 4 in
libibverbs.so.1.0.0[7f0142c9e000+11000]
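The dmesg line can be decoded by hand before opening the core. A minimal sketch in C, assuming the usual x86 page-fault error-code bits (bit 0 = page present, bit 1 = write, bit 2 = user mode); the struct and function names are hypothetical and only for illustration. Subtracting the mapping base from the instruction pointer gives the offset into libibverbs (0x7f0142ca9a34 - 0x7f0142c9e000 = 0xba34). Error 4 would mean a user-mode read of an unmapped page, and the fault address 0x18 is consistent with reading a field through a NULL pointer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fields copied by hand from a kernel line of the form
 * "segfault at ADDR ip IP sp SP error ERR in LIB[BASE+LEN]".
 * Hypothetical helper type, not part of any library. */
struct segv_info {
    uint64_t fault_addr; /* address whose access faulted */
    uint64_t ip;         /* faulting instruction pointer */
    uint64_t map_base;   /* start of the mapping reported in brackets */
    uint64_t map_len;    /* length of that mapping */
    unsigned err;        /* page-fault error code */
};

/* Compute the offset of the faulting instruction inside the mapping.
 * Returns false if ip does not fall inside [map_base, map_base + map_len). */
static inline bool segv_ip_offset(const struct segv_info *s, uint64_t *off)
{
    if (s->ip < s->map_base || s->ip - s->map_base >= s->map_len)
        return false;
    *off = s->ip - s->map_base;
    return true;
}

/* Assumed x86 page-fault error-code bits. */
static inline bool segv_page_present(unsigned err) { return (err & 0x1u) != 0; }
static inline bool segv_is_write(unsigned err)     { return (err & 0x2u) != 0; }
static inline bool segv_from_user(unsigned err)    { return (err & 0x4u) != 0; }
```

Filling the struct with the values above (fault address 0x18, ip 0x7f0142ca9a34, base 0x7f0142c9e000, length 0x11000, error 4) yields the offset 0xba34. If the mapping starts at file offset 0, that offset can be looked up in the library's debuginfo.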

# gdb rping core.3289
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-80.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/bin/rping...Reading symbols from
/usr/lib/debug/usr/bin/rping.debug...done.
done.
[New LWP 3289]
[New LWP 3292]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `rping -s -a 0.0.0.0'.
Program terminated with signal 11, Segmentation fault.
#0 __ibv_alloc_pd (context=0x0) at src/verbs.c:196
196 pd = context->ops.alloc_pd(context);
Missing separate debuginfos, use: debuginfo-install
libcxgb3-1.3.1-8.el7.x86_64 libcxgb4-1.3.5-3.el7.x86_64
libipathverbs-1.3-2.el7.x86_64 libmlx4-1.0.6-5.el7.x86_64
libmlx5-1.0.2-1.el7.x86_64 libmthca-1.0.6-13.el7.x86_64
libnes-1.1.4-2.el7.x86_64 libnl3-3.2.21-10.el7.x86_64
(gdb) bt
#0 __ibv_alloc_pd (context=0x0) at src/verbs.c:196
#1 0x0000563bc5c1f5c6 in rping_setup_qp (cb=cb@entry=0x563bc61f9780,
cm_id=<optimized out>)
at examples/rping.c:519
#2 0x0000563bc5c1de5a in rping_run_server (cb=0x563bc61f9780) at
examples/rping.c:890
#3 main (argc=4, argv=0x7ffd7f0a8ee8) at examples/rping.c:1268
(gdb) f 0
#0 __ibv_alloc_pd (context=0x0) at src/verbs.c:196
196 pd = context->ops.alloc_pd(context);
(gdb) list
191
192 struct ibv_pd *__ibv_alloc_pd(struct ibv_context *context)
193 {
194 struct ibv_pd *pd;
195
196 pd = context->ops.alloc_pd(context);
197 if (pd)
198 pd->context = context;
199
200 return pd;
(gdb) p context
$1 = (struct ibv_context *) 0x0
(gdb)
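The listing shows __ibv_alloc_pd dereferencing its context argument unconditionally, so the NULL handed down from rping_setup_qp (presumably the cm_id's verbs pointer, which gdb reports as optimized out) crashes instead of producing an error. A minimal sketch of a caller-side guard follows. It uses hypothetical stand-in types (demo_context, demo_pd) rather than the real libibverbs structs so that it builds on its own, and the choice of ENODEV is an assumption.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct ibv_context / struct ibv_pd.
 * Only the shape matters: a context carrying an alloc_pd callback. */
struct demo_context;
struct demo_pd {
    struct demo_context *context;
};
struct demo_context {
    struct demo_pd *(*alloc_pd)(struct demo_context *ctx);
};

/* Same logic as the listed __ibv_alloc_pd, plus the NULL check that
 * would have turned the segfault into an ordinary failure. */
static inline struct demo_pd *demo_alloc_pd_checked(struct demo_context *ctx)
{
    struct demo_pd *pd;

    if (ctx == NULL || ctx->alloc_pd == NULL) {
        errno = ENODEV; /* assumed error code: no usable device context */
        return NULL;
    }

    pd = ctx->alloc_pd(ctx);
    if (pd)
        pd->context = ctx;
    return pd;
}

/* Trivial fake provider callback so the sketch can be exercised. */
static struct demo_pd demo_static_pd;
static inline struct demo_pd *demo_provider_alloc_pd(struct demo_context *ctx)
{
    (void)ctx;
    return &demo_static_pd;
}
```

A guard like this only hides the symptom. The open question is still why the connection ID has no device context attached in the first place.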



# rping -c -a 192.168.0.13
cma event RDMA_CM_EVENT_REJECTED, error 28
wait for CONNECTED state 4
connect error -1
Segmentation fault

From dmesg:
[Mon May 15 14:27:24 2017] rping[3075]: segfault at 7f2386c800a8 ip
00007f2386672adf sp 00007ffd40e5df60 error 4 in
libibverbs.so.1.0.0[7f2386667000+11000]

# gdb rping core.3075
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-80.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/bin/rping...Reading symbols from
/usr/lib/debug/usr/bin/rping.debug...done.
done.
[New LWP 3075]
[New LWP 3078]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `rping -c -a 192.168.0.13'.
Program terminated with signal 11, Segmentation fault.
#0 __ibv_dereg_mr (mr=0x5624b478c740) at src/verbs.c:237
237 ret = mr->context->ops.dereg_mr(mr);
Missing separate debuginfos, use: debuginfo-install
libcxgb3-1.3.1-8.el7.x86_64 libcxgb4-1.3.5-3.el7.x86_64
libgcc-4.8.5-4.el7.x86_64 libipathverbs-1.3-2.el7.x86_64
libmlx4-1.0.6-5.el7.x86_64 libmlx5-1.0.2-1.el7.x86_64
libmthca-1.0.6-13.el7.x86_64 libnes-1.1.4-2.el7.x86_64 libnl3-3.2.21-10.el7.x86_64
(gdb) bt
#0 __ibv_dereg_mr (mr=0x5624b478c740) at src/verbs.c:237
#1 0x00005624b2b618a7 in rping_free_buffers (cb=0x5624b4786780) at
examples/rping.c:470
#2 0x00005624b2b5fef3 in rping_run_client (cb=<optimized out>) at
examples/rping.c:1111
#3 main (argc=<optimized out>, argv=<optimized out>) at examples/rping.c:1270
(gdb) f 0
#0 __ibv_dereg_mr (mr=0x5624b478c740) at src/verbs.c:237
237 ret = mr->context->ops.dereg_mr(mr);
(gdb) list
232 {
233 int ret;
234 void *addr = mr->addr;
235 size_t length = mr->length;
236
237 ret = mr->context->ops.dereg_mr(mr);
238 if (!ret)
239 ibv_dofork_range(addr, length);
240
241 return ret;
(gdb) p *mr
$1 = {context = 0x7f2386c80070, pd = 0x5624b4789f30, addr =
0x5624b47867e8, length = 16, handle = 0,
lkey = 162608, rkey = 162608}
(gdb) p *mr->context
Cannot access memory at address 0x7f2386c80070
----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Mon, May 15, 2017 at 4:39 AM, Sagi Grimberg <sagi@xxxxxxxxxxx> wrote:
> Hi Robert,
>
>> We are trying to leverage multiple cards/ports for iSER for
>> performance and resiliency reasons. The ports are configured with only
>> IPv6 addresses and each port is on a separate VLAN/subnet that is
>> routable to each other subnet. We are using rules with tables to set a
>> default gateway for each adapter/subnet based on the source IPv6
>> address (policy based routing). Using TCP for iSCSI, everything works
>> fine and traffic ingresses/egresses the right ports. However, when we
>> try using iSER, we get connection errors.
>>
>> May 12 13:39:27 prv-0-14-roberttest kernel: iser: iser_connect:
>> rdma_resolve_addr failed: -101
>> May 12 13:39:27 prv-0-14-roberttest iscsid: Received iferror -101:
>> Network is unreachable.
>> May 12 13:39:27 prv-0-14-roberttest iscsid: cannot make a connection
>> to 2604:3140:40:300:0:580:d0:0:3260 (-101,0)
>
>
> This looks 100% rdma_cm to me. iser is completely agnostic to address
> families and routes.
>
>> If we put a default gateway for IPv6 in the 'default' table, then iSER
>> is able to make a connection, but we can only use one port. It looks
>> as if iSER is not following the rules in the default routing table to
>> find the appropriate default gateway in a different table.
>
>
> As I said, iser relies on rdma_cm for routing decisions.
> I would expect all rdma_cm based protocols to be
> affected as well (nfs, nvmf).
>
> Did you check plain rping like Or suggested?