On 3/17/2020 5:31 AM, Bernard Metzler wrote:
-----"Krishnamraju Eraparaju" <krishna2@xxxxxxxxxxx> wrote: -----
To: "Bernard Metzler" <BMT@xxxxxxxxxxxxxx>, sagi@xxxxxxxxxxx,
hch@xxxxxx
From: "Krishnamraju Eraparaju" <krishna2@xxxxxxxxxxx>
Date: 03/16/2020 05:20PM
Cc: linux-nvme@xxxxxxxxxxxxxxxxxxx, linux-rdma@xxxxxxxxxxxxxxx,
"Nirranjan Kirubaharan" <nirranjan@xxxxxxxxxxx>, "Potnuri Bharat
Teja" <bharat@xxxxxxxxxxx>
Subject: [EXTERNAL] broken CRCs at NVMeF target with SIW & NVMe/TCP
transports
I'm seeing broken CRCs at the NVMeF target while running the below
program at the host. Here the RDMA transport is SoftiWARP, but I'm
seeing the same issue with NVMe/TCP as well.
It appears to me that the same buffer is being rewritten by the
application/ULP before the completion for the previous request is
received. HW-based transports (like iw_cxgb4) do not show this issue
because they copy/DMA the data and then compute the CRC on the copied
buffer.
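For illustration only (this is not the driver code; the buffer size
and contents below are made up), a small self-contained program showing
why a digest computed directly over the caller's buffer breaks when the
buffer is rewritten before the data is copied out, while a digest over
a stable copy does not (build with -lz):

#include <stdio.h>
#include <string.h>
#include <zlib.h>

int main(void)
{
    unsigned char buf[4096], copy[4096];

    memset(buf, 'A', sizeof(buf));

    /* Case 1: digest computed directly over the caller's buffer. */
    unsigned long crc_sent = crc32(0L, buf, sizeof(buf));
    /* The ULP rewrites part of the page before the data is copied
     * out to the wire, so the transmitted bytes no longer match. */
    memcpy(buf, "123", 3);
    unsigned long crc_recv = crc32(0L, buf, sizeof(buf));
    printf("in-place buffer: sent 0x%08lx recv 0x%08lx -> %s\n",
           crc_sent, crc_recv, crc_sent == crc_recv ? "match" : "MISMATCH");

    /* Case 2: copy first (as a copying/DMA'ing transport effectively
     * does), then digest the stable copy; a later rewrite of the
     * original buffer no longer matters. */
    memset(buf, 'A', sizeof(buf));
    memcpy(copy, buf, sizeof(copy));
    crc_sent = crc32(0L, copy, sizeof(copy));
    memcpy(buf, "123", 3);               /* rewrite only the original */
    crc_recv = crc32(0L, copy, sizeof(copy));
    printf("stable copy:     sent 0x%08lx recv 0x%08lx -> %s\n",
           crc_sent, crc_recv, crc_sent == crc_recv ? "match" : "MISMATCH");

    return 0;
}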
Thanks Krishna!
Yes, I see those errors as well. For NVMe/TCP, I see it if the
data digest is enabled, which is functionally similar to having CRC
enabled for iWARP. That is what the '-G' command line switch in your
TCP connect command enables.
With SoftiWARP at the host side and iWARP hardware at the target
side, CRC gets enabled. Then I see that problem at the host side for
SEND type work requests: a page of data referenced by the SEND
sometimes gets modified by the ULP after CRC computation and before
the data gets handed over (copied) to TCP via kernel_sendmsg(), and
far before the ULP reaps a work completion for that SEND. So the ULP
sometimes touches the buffer after passing ownership to the provider,
and before getting it back through a matching work completion.
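As an illustration of the ownership rule being described (userspace
verbs, just a sketch; QP/CQ/MR setup is omitted, the function name is
made up, and the path discussed here is of course the in-kernel NVMe
host driver, not a userspace consumer):

#include <infiniband/verbs.h>
#include <stdint.h>
#include <string.h>

/* Post one signalled SEND over 'buf' and wait for its completion.
 * Between ibv_post_send() and the reaped completion the buffer is
 * owned by the provider; a software provider like siw may compute
 * the CRC and copy the data at any point in that window, so the
 * ULP must not touch 'buf' until the completion comes back. */
static int send_and_wait(struct ibv_qp *qp, struct ibv_cq *cq,
                         struct ibv_mr *mr, void *buf, uint32_t len)
{
    struct ibv_sge sge = {
        .addr   = (uintptr_t)buf,
        .length = len,
        .lkey   = mr->lkey,
    };
    struct ibv_send_wr wr = {
        .sg_list    = &sge,
        .num_sge    = 1,
        .opcode     = IBV_WR_SEND,
        .send_flags = IBV_SEND_SIGNALED,
    }, *bad_wr;
    struct ibv_wc wc;
    int n;

    if (ibv_post_send(qp, &wr, &bad_wr))
        return -1;

    /* Writing to 'buf' here would be exactly the bug discussed above. */

    do {
        n = ibv_poll_cq(cq, 1, &wc);    /* busy-poll for the completion */
    } while (n == 0);

    if (n < 0 || wc.status != IBV_WC_SUCCESS)
        return -1;

    /* Only now does 'buf' belong to the ULP again and may be reused. */
    memset(buf, 0, len);
    return 0;
}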
Well, that's a plain ULP bug. It's the very definition of a
send queue work request completion that the buffer has been
accepted by the LLP. Would the ULP read a receive buffer before
getting a completion? Same thing. Would the ULP complain if
its application consumer wrote data into async i/o O_DIRECT
buffers, or while it computed a krb5i hash? Yep.
With siw and CRC switched off, this issue goes undetected, since
TCP copies the buffer at some point in time and computes its TCP/IP
checksum only on that stable copy, or typically even offloads it to
the NIC.
An excellent test, and I'd love to know what ULPs/apps you
caught with it.
Tom.
Another question is whether we may end up placing stale data, or
whether closing the file recovers from the error by re-sending the
affected data. In my experiments so far, I have never detected
broken file content after file close.
Thanks,
Bernard.
Please share your thoughts/comments/suggestions on this.
Commands used:
--------------
#nvme connect -t tcp -G -a 102.1.1.6 -s 4420 -n nvme-ram0   ==> for NVMe/TCP
#nvme connect -t rdma -a 102.1.1.6 -s 4420 -n nvme-ram0     ==> for SoftiWARP
#mkfs.ext3 -F /dev/nvme0n1   (the issue occurs more frequently with ext3 than with ext4)
#mount /dev/nvme0n1 /mnt
#Then run the below program:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main() {
    int i;
    char *line1 = "123";
    FILE *fp;

    while (1) {
        fp = fopen("/mnt/tmp.txt", "w");
        if (fp == NULL)
            exit(1);
        /* Unbuffered stdio: every fwrite() goes to the file
         * immediately instead of being batched in a stdio buffer. */
        setvbuf(fp, NULL, _IONBF, 0);
        for (i = 0; i < 100000; i++)
            if (fwrite(line1, 1, strlen(line1), fp) != strlen(line1))
                exit(1);
        if (fclose(fp) != 0)
            exit(1);
    }
    return 0;
}
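The program can be built and run with, for example (the file name is
arbitrary):

#gcc -O2 -o write_loop write_loop.c
#./write_loop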
DMESG at NVMe/TCP Target:
[ +5.119267] nvmet_tcp: queue 2: cmd 83 pdu (6) data digest error: recv 0xb1acaf93 expected 0xcd0b877d
[ +0.000017] nvmet: ctrl 1 fatal error occurred!
Thanks,
Krishna.