Re: [PATCH] cifs: Fix oops due to uncleared server->smbd_conn in reconnect

On 1/26/2023 10:20 AM, David Howells wrote:
Tom Talpey <tom@xxxxxxxxxx> wrote:

Do you have any logging from the softRoCE runs? I'd suspect some
kind of RDMA-specific scatter/gather overflow which might be
server-side as easily as client-side.

On client, try:
   echo 0x1ff >/sys/module/cifs/parameters/smbd_logging_class

On server:
    ksmbd.control -d conn
    ksmbd.control -d rdma

Okay, on -rc5 without my patches, using:

# rdma link add rxe0 type rxe netdev enp6s0 # andromeda, softRoCE
# mount //192.168.6.1/test /xfstest.test -o user=shares,pass=foobar,rdma
# dd if=/dev/zero of=/xfstest.test/hello2 bs=16k count=1 oflag=direct conv=notrunc seek=2

the dd hangs.  I've captured the client and server logging you requested plus
a pcap file on the server (see attached).

Note also that I tried md5summing a 1MiB file, and that produced a different
MD5 sum each time.  I couldn't see enough data being transferred in the pcap
to indicate that the whole file was actually being read.
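
For anyone wanting to repeat the checksum observation, here is a minimal sketch
of such a check.  The helper name count_distinct_md5 and the run count are
hypothetical and not taken from the thread; it simply checksums the same file
several times and reports how many distinct sums came back.  On a network mount
the client may serve later reads from its page cache, so it is probably worth
remounting or dropping caches between runs.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): checksum one file repeatedly
# and print the number of distinct MD5 sums that were observed.
# A result of 1 means every read returned identical data.
count_distinct_md5() {
    file=$1
    runs=${2:-5}        # default to 5 passes if no count is given
    i=0
    while [ "$i" -lt "$runs" ]; do
        # md5sum prints "<hash>  <name>"; keep only the hash
        md5sum "$file" | cut -d' ' -f1
        i=$((i + 1))
    done | sort -u | wc -l
}
```

Running count_distinct_md5 on a 1MiB file under /xfstest.test and getting
anything other than 1 would match the behaviour described above.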

It looks like the server is seeing transmit timeouts on its responses,
there are 7 of these in server-log.txt:

[3700697.936899] ksmbd: smb_direct: read/write error. opcode = 0, status = transport retry counter exceeded(12)
[3700697.937043] ksmbd: Failed to send message: -107

Maybe this is a softiWARP issue?