Re: Spurious instability with NFSoRDMA under moderate load

On 17.08.2021 23:51, Timo Rothenpieler wrote:
On 17.08.2021 23:08, Chuck Lever III wrote:
I tried reproducing this using your 'xfs_io -fc "copy_range
testfile" testfile.copy' reproducer, but couldn't.

A network capture shows that the client tries CLONE first. The
server reports that's not supported, so the client tries COPY.
The COPY works, and the reply shows that the COPY was
synchronous. Thus there's no need for a callback, and I'm not
tripping over backchannel misbehavior.

Make sure the testfile is of sufficient size. I'm not sure what the threshold is, but if it's too small, it'll just do a synchronous copy for me as well.
I'm using a 50MB file in my tests.

The export I'm using is an xfs filesystem. Did you already
report the filesystem type you're testing against? I can't
find it in the thread.

If there's a way to force an offload-style COPY, let me know.

Oh. Also I looked at what might have been pulled into the
linux-5.12.y kernel between .12 and .19, and I don't see
anything that's especially relevant to either COPY_OFFLOAD
or backchannel.

I'm observing this with both an ext4 and zfs filesystem.
Can easily test xfs as well if desired.

I re-ran the test with xfs instead of ext4, and indeed, the issue does not manifest in that case. I guess xfs is lacking some capability for server-side copy to work properly.

Are you testing this on a normal network, or with RDMA? With plain TCP, I can't observe this issue either (the backchannel doesn't time out in the first place); it only happens in RDMA mode.
I'm using Mellanox ConnectX-4 cards in IB mode for my tests.
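
To confirm which transport a given mount is really using, one option is to look at the `proto=` option in `/proc/mounts`. The sketch below is only an illustration: the function name `nfs_transports` and the sample line in the usage note are hypothetical, and the exact option list varies by kernel and mount configuration.

```python
def nfs_transports(mounts_text):
    """Map each NFS mount point to its proto= option.

    mounts_text is text in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Skip malformed lines and anything that is not an NFS mount.
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        # Options are comma-separated; some are bare flags without "=".
        opts = dict(
            opt.split("=", 1) if "=" in opt else (opt, "")
            for opt in fields[3].split(",")
        )
        result[fields[1]] = opts.get("proto", "unknown")
    return result
```

On a live system this could be called as `nfs_transports(open("/proc/mounts").read())`. For a made-up entry such as `server:/export /mnt/nfs nfs4 rw,vers=4.2,proto=rdma,port=20049 0 0` it would report `rdma` for `/mnt/nfs`.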



