On 1/26/2023 10:20 AM, David Howells wrote:
> Tom Talpey <tom@xxxxxxxxxx> wrote:
> > Do you have any logging from the softRoCE runs? I'd suspect some
> > kind of RDMA-specific scatter/gather overflow which might be
> > server-side as easily as client-side.
> > On client, try:
> > echo 0x1ff >/sys/module/cifs/parameters/smbd_logging_class
> > On server:
> > ksmbd.control -d conn
> > ksmbd.control -d rdma
> Okay, on -rc5 without my patches, using:
> # rdma link add rxe0 type rxe netdev enp6s0 # andromeda, softRoCE
> # mount //192.168.6.1/test /xfstest.test -o user=shares,pass=foobar,rdma
> # dd if=/dev/zero of=/xfstest.test/hello2 bs=16k count=1 oflag=direct conv=notrunc seek=2
> the dd hangs. I've captured the client and server logging you requested plus
> a pcap file on the server (see attached).
> Note also I tried md5summing a 1MiB file and that produced a different MD5 sum
> each time. I couldn't see enough data being transferred in the pcap to
> indicate that that was happening.
It looks like the server is seeing transmit timeouts on its responses;
there are 7 of these in server-log.txt:
[3700697.936899] ksmbd: smb_direct: read/write error. opcode = 0, status = transport retry counter exceeded(12)
[3700697.937043] ksmbd: Failed to send message: -107
Maybe this is a softRoCE issue?
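For anyone repeating the count, here is a minimal sketch of how the two messages above could be tallied from the server log. The function names are made up for illustration, and the default path assumes the attached server-log.txt is in the current directory. On Linux, -107 is -ENOTCONN, so the failed sends probably follow the connection being torn down after the retry counter ran out.

```shell
#!/bin/sh
# Sketch only: count_matches and summarize_ksmbd_log are hypothetical helper
# names, not part of ksmbd or ksmbd-tools.

count_matches() {
    # grep -c prints the number of matching lines. It exits 1 when there are
    # no matches, so "|| true" keeps this usable under "set -e".
    grep -c -- "$1" "$2" || true
}

summarize_ksmbd_log() {
    # Assumed default: the server-log.txt attachment in the current directory.
    log="${1:-server-log.txt}"
    printf 'retry counter exceeded: %s\n' \
        "$(count_matches 'transport retry counter exceeded' "$log")"
    printf 'failed sends: %s\n' \
        "$(count_matches 'ksmbd: Failed to send message' "$log")"
}
```

Calling `summarize_ksmbd_log server-log.txt` prints one line per message type. If the two counts match, that would suggest each failed send is tied to one retry-exhausted completion, though the log alone does not prove it.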