Re: [PATCHv10 0/9] write hints with nvme fdp, scsi streams

Bart Van Assche <bvanassche@xxxxxxx> · Mon, 9 Dec 2024 14:13:40 -0800

On 12/5/24 12:03 AM, Nitesh Shetty wrote:
> But where do we store the read sector info before sending the write?
> I see 2 approaches here:
> 1. Should it be part of a payload along with the write?
>      We did something similar in a previous series, which was not liked
>      by Christoph and Bart.
> 2. Or should the driver store it as part of an internal list inside
>    the namespace/ctrl data structure?
>      As Bart pointed out, here we might need to send one more fail
>      request later if copy_write fails to land in the same driver.

Hi Nitesh,

Consider the following example: dm-linear is used to concatenate two
block devices, an NVMe device (LBAs 0..999) and a SCSI device (LBAs
1000..1999). Suppose that a copy operation is submitted to the dm-linear
device to copy LBAs 1..998 to LBAs 1001..1998. If the copy operation is
submitted as two separate operations (REQ_OP_COPY_SRC and
REQ_OP_COPY_DST) then the NVMe device will receive the REQ_OP_COPY_SRC
operation and the SCSI device will receive the REQ_OP_COPY_DST
operation. The NVMe and SCSI device drivers should fail the copy 
operations after a timeout because they only received half of the copy
operation. After the timeout the block layer core can switch from
offloading to emulating a copy operation. Waiting for a timeout is
necessary because requests may be reordered.
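
To make the routing problem concrete, here is a minimal sketch in Python
rather than kernel code. It models the dm-linear table from the example and
routes the two halves of a split copy independently, assuming a source that
starts at dm LBA 1 and a destination that starts at dm LBA 1001, i.e. in the
SCSI half. TARGETS, map_lba() and route_split_copy() are hypothetical names
used only for illustration; they are not block layer or device mapper APIs.

```python
# Hypothetical model of the dm-linear table from the example:
# (first dm LBA, last dm LBA, underlying device).
TARGETS = [
    (0, 999, "nvme"),      # dm LBAs 0..999 map to the NVMe device
    (1000, 1999, "scsi"),  # dm LBAs 1000..1999 map to the SCSI device
]


def map_lba(lba):
    """Return (device, device-relative LBA) for a dm-linear LBA."""
    for first, last, dev in TARGETS:
        if first <= lba <= last:
            return dev, lba - first
    raise ValueError(f"LBA {lba} is outside the dm-linear device")


def route_split_copy(src_first, dst_first):
    """Route the two halves of a split copy independently of each other."""
    src_dev, _ = map_lba(src_first)
    dst_dev, _ = map_lba(dst_first)
    return {"REQ_OP_COPY_SRC": src_dev, "REQ_OP_COPY_DST": dst_dev}


# Source starts at dm LBA 1, destination at dm LBA 1001.
routing = route_split_copy(1, 1001)
print(routing)
```

Each driver sees only its own half, so neither can tell on its own whether
the other half is still in flight or was sent to a different driver.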

I think this is a strong argument in favor of representing copy
operations as a single operation. This will allow stacking drivers
such as dm-linear to deal in an elegant way with copy offload requests
where source and destination LBA ranges map onto different block
devices and potentially different block drivers.
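
As a rough illustration of why a single operation helps, the Python sketch
below models a copy request that carries both ranges. With both ranges
visible at once, a stacking driver can decide immediately whether the copy
stays within one underlying device. CopyOp, devices_for() and dispatch() are
hypothetical names, and falling back to emulation is only one possible
policy assumed here; this is not a proposed kernel interface.

```python
from dataclasses import dataclass

# Hypothetical dm-linear table: (first dm LBA, last dm LBA, device).
TARGETS = [(0, 999, "nvme"), (1000, 1999, "scsi")]


@dataclass
class CopyOp:
    """A copy request that carries source and destination together."""
    src_first: int
    dst_first: int
    nr_blocks: int


def devices_for(first, nr_blocks):
    """Return the set of underlying devices that a dm LBA range touches."""
    last = first + nr_blocks - 1
    return {dev for lo, hi, dev in TARGETS if first <= hi and last >= lo}


def dispatch(op):
    """Offload only if both ranges live on one device (assumed policy)."""
    devs = devices_for(op.src_first, op.nr_blocks)
    devs |= devices_for(op.dst_first, op.nr_blocks)
    if len(devs) == 1:
        return f"offload to {devs.pop()}"
    return "emulate with read + write"


cross_device = dispatch(CopyOp(src_first=1, dst_first=1001, nr_blocks=998))
same_device = dispatch(CopyOp(src_first=0, dst_first=500, nr_blocks=100))
print(cross_device)
print(same_device)
```

In this model the cross-device case is detected at dispatch time, so no
timeout is needed before switching to emulation.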

Thanks,

Bart.