For TCP we can set BDI_CAP_STABLE_WRITES. For RDMA I don't think
that is a good idea, as pretty much all RDMA block drivers rely on
the DMA behavior above. The answer is to bounce-buffer the data in
SoftiWARP / SoftRoCE.
We already do, see nvme_alloc_ns.
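(For reference, in the kernels under discussion the hookup in
nvme_alloc_ns looks roughly like this; the flag is only set when the
controller was connected with data digest enabled:)

	if (ctrl->opts && ctrl->opts->data_digest)
		ns->queue->backing_dev_info->capabilities |=
			BDI_CAP_STABLE_WRITES;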
Krishna was hitting the issue when testing TCP/NVMeF with -G passed
at connect time. That enables the data digest and, I think,
STABLE_WRITES. So it seems to me that we don't get stable pages, but
pages which are touched after handover to the provider.
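To make the suspected ordering concrete, here is a hypothetical
sketch (not the actual nvme-tcp code path; the driver's digest boils
down to the same crc32c the plain helper computes):

	#include <linux/crc32c.h>
	#include <linux/mm.h>

	/*
	 * Digest the payload, then hand the page off by reference.
	 * If the page owner rewrites it before the socket consumes
	 * it, the bytes on the wire no longer match ddgst and the
	 * target reports a data digest error.
	 */
	static u32 digest_then_send(struct page *page, unsigned int len)
	{
		u32 ddgst = crc32c(~0, page_address(page), len);

		/* ... page queued for zero-copy transmission here ... */
		return ddgst;
	}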
None of the transports modifies the data at any point; both only
scan it to compute the CRC. So surely this is coming from the fs.
Krishna, does this happen with xfs as well?
Yes, but it is rare (took ~15min to recreate), whereas with ext3/4
it's almost immediate. Here is the error log for NVMe/TCP with xfs.
Thanks Krishna,
I assume that this makes the issue go away?
--
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 11e10fe1760f..cc93e1949b2c 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -889,7 +889,7 @@ static int nvme_tcp_try_send_data(struct nvme_tcp_request *req)
 			flags |= MSG_MORE;
 
 		/* can't zcopy slab pages */
-		if (unlikely(PageSlab(page))) {
+		if (unlikely(PageSlab(page)) || queue->data_digest) {
 			ret = sock_no_sendpage(queue->sock, page, offset, len,
 					flags);
 		} else {
--
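(The reason this should help: kernel_sendpage() transmits the page by
reference, so the payload can still change between digest calculation
and actual transmission, while sock_no_sendpage() copies the data into
the socket buffer at send time, so the bytes on the wire match the
digest we computed.)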