Re: RDMA (smbdirect) testing

On 5/22/2022 7:06 PM, Namjae Jeon wrote:
> 2022-05-21 20:54 GMT+09:00, Tom Talpey <tom@xxxxxxxxxx>:
>> On 5/20/2022 2:12 PM, David Howells wrote:
>>> Tom Talpey <tom@xxxxxxxxxx> wrote:

>>>> SoftROCE is a bit of a hot mess in upstream right now. It's
>>>> getting a lot of attention, but it's still pretty shaky.
>>>> If you're testing, I'd STRONGLY recommend SoftiWARP.

>>> I'm having problems getting that working.  I'm setting the client up
>>> with:
>>>
>>>   rdma link add siw0 type siw netdev enp6s0
>>>   mount //192.168.6.1/scratch /xfstest.scratch -o rdma,user=shares,pass=...
>>>
>>> and then see the following in dmesg:
>>>
>>>   CIFS: Attempting to mount \\192.168.6.1\scratch
>>>   CIFS: VFS: _smbd_get_connection:1513 warning: device max_send_sge = 6 too small
>>>   CIFS: VFS: _smbd_get_connection:1516 Queue Pair creation may fail
>>>   CIFS: VFS: _smbd_get_connection:1519 warning: device max_recv_sge = 6 too small
>>>   CIFS: VFS: _smbd_get_connection:1522 Queue Pair creation may fail
>>>   CIFS: VFS: _smbd_get_connection:1559 rdma_create_qp failed -22
>>>   CIFS: VFS: _smbd_get_connection:1513 warning: device max_send_sge = 6 too small
>>>   CIFS: VFS: _smbd_get_connection:1516 Queue Pair creation may fail
>>>   CIFS: VFS: _smbd_get_connection:1519 warning: device max_recv_sge = 6 too small
>>>   CIFS: VFS: _smbd_get_connection:1522 Queue Pair creation may fail
>>>   CIFS: VFS: _smbd_get_connection:1559 rdma_create_qp failed -22
>>>   CIFS: VFS: cifs_mount failed w/return code = -2
>>>
>>> Problem is, I don't know what to do about it :-/

>> It looks like the client is hardcoding 16 sge's, and has no option to
>> configure a smaller value, or reduce its requested number. That's bad,
>> because providers all have their own limits - and SIW_MAX_SGE is 6. I
>> thought I'd seen this working (metze?), but either the code changed or
>> someone built a custom version.
> I also fully agree that we should give users a way to configure this
> value.

>> Namjae/Long, have you used siw successfully?
> No. I was able to reproduce the same problem that David reported.
> Hyunchul and I will take a look. I also confirmed that RDMA works
> without any problems over soft-ROCE. Until this problem is fixed, I'd
> suggest that David use soft-ROCE.
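For reference, a minimal sketch of the soft-ROCE setup Namjae suggests, mirroring David's siw commands above (the interface name enp6s0 is carried over from his example, and the rxe0 device name is arbitrary):

```shell
# Load the soft-ROCE (rxe) driver and bind an rxe device to the
# Ethernet interface (enp6s0 here, matching the siw example above).
modprobe rdma_rxe
rdma link add rxe0 type rxe netdev enp6s0

# Verify the link is up before retrying the rdma mount.
rdma link show
```

The same mount command should then work with `-o rdma` once the rxe link is visible on both client and server.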

>> Why does the code require 16 sge's, regardless of other size limits?
>> Normally, if the lower layer supports fewer, the upper layer will
>> simply reduce its operation sizes.
> This should be answered by Long Li. It seems that he chose a value
> optimized for the NICs he used when implementing RDMA in cifs.

"Optimized" is a funny choice of words. If the provider doesn't support
the value, it's not much of an optimization to insist on 16. :)

Personally, I'd try building a kernel with smbdirect.h changed to have
SMBDIRECT_MAX_SGE set to 6, and see what happens. You might have to
reduce the r/w sizes in mount, depending on any other issues this may
reveal.

Tom.


