On 12/9/24 6:20 PM, Martin K. Petersen wrote:
> What would be the benefit of submitting these operations concurrently?
I expect that submitting the two copy operations concurrently would result in lower latency for NVMe devices because the REQ_OP_COPY_DST
operation can be submitted without waiting for the REQ_OP_COPY_SRC result.
> As I have explained, it adds substantial complexity and object lifetime issues throughout the stack. To what end?
I think the approach of embedding the ROD token in the bio payload would add complexity in the block layer. The token-based copy offload approach involves submitting at least the following commands to the SCSI device:
* POPULATE TOKEN with a list identifier and source data ranges as parameters to send the source data ranges to the device.
* RECEIVE ROD TOKEN INFORMATION with a list identifier as parameter to receive the ROD token.
* WRITE USING TOKEN with the ROD token and the destination ranges as parameters to tell the device to start the copy operation.

If the block layer had to manage the ROD token, how would the ROD token be provided to the block layer? Bidirectional commands were removed from the Linux kernel a while ago, so the REQ_OP_COPY_IN parameter data would have to be used to pass parameters to the SCSI driver and also to pass the ROD token back to the block layer. A possible approach is to let the SCSI core allocate memory for the ROD token with kmalloc and to pass that pointer back to the block layer by writing that pointer into the REQ_OP_COPY_IN parameter data. While this can be implemented, I'm not sure that we should integrate support in the block layer for managing ROD tokens since ROD tokens are a concept that is specific to the SCSI protocol.

Thanks,

Bart.
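
To make the three-command sequence described above concrete, here is a minimal user-space sketch in C that runs the flow against a toy in-memory device. Everything in it is hypothetical and made up for illustration: struct sim_dev, struct sim_range, the sim_* functions, the single remembered source range and the 16-byte token. It is not kernel code, it does not build real CDBs, and the toy device copies the data at WRITE USING TOKEN time, which a real device need not do.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SIM_NR_BLOCKS 64	/* toy device: one byte per logical block */
#define SIM_TOKEN_LEN 16	/* toy token size, not the real ROD token size */

struct sim_range {
	uint32_t lba;		/* first block of the range */
	uint32_t nr;		/* number of blocks */
};

struct sim_dev {
	uint8_t data[SIM_NR_BLOCKS];
	uint32_t list_id;	/* list identifier of the last POPULATE TOKEN */
	struct sim_range src;	/* source range remembered by the device */
	uint8_t token[SIM_TOKEN_LEN];
	int token_valid;
};

static int sim_range_ok(struct sim_range r)
{
	return r.nr > 0 && r.lba < SIM_NR_BLOCKS &&
	       r.nr <= SIM_NR_BLOCKS - r.lba;
}

/* Step 1, POPULATE TOKEN: send list identifier and source range. */
static int sim_populate_token(struct sim_dev *d, uint32_t list_id,
			      struct sim_range src)
{
	assert(sizeof(list_id) <= SIM_TOKEN_LEN);
	if (!sim_range_ok(src))
		return -1;
	d->list_id = list_id;
	d->src = src;
	/* The token is opaque to the host; here it only encodes the list id. */
	memset(d->token, 0, sizeof(d->token));
	memcpy(d->token, &list_id, sizeof(list_id));
	d->token_valid = 1;
	return 0;
}

/* Step 2, RECEIVE ROD TOKEN INFORMATION: fetch the token by list identifier. */
static int sim_receive_rod_token_information(const struct sim_dev *d,
					     uint32_t list_id,
					     uint8_t token_out[SIM_TOKEN_LEN])
{
	if (!d->token_valid || d->list_id != list_id)
		return -1;
	memcpy(token_out, d->token, SIM_TOKEN_LEN);
	return 0;
}

/* Step 3, WRITE USING TOKEN: hand the token back with the destination range. */
static int sim_write_using_token(struct sim_dev *d,
				 const uint8_t token[SIM_TOKEN_LEN],
				 struct sim_range dst)
{
	if (!d->token_valid || memcmp(token, d->token, SIM_TOKEN_LEN) != 0)
		return -1;
	if (!sim_range_ok(dst) || dst.nr != d->src.nr)
		return -1;
	memmove(&d->data[dst.lba], &d->data[d->src.lba], dst.nr);
	return 0;
}

/* Host side: each step needs the result of the previous one. */
static int sim_copy_offload(struct sim_dev *d, struct sim_range src,
			    struct sim_range dst)
{
	uint8_t token[SIM_TOKEN_LEN];	/* someone has to own this buffer */
	const uint32_t list_id = 1;

	if (sim_populate_token(d, list_id, src))
		return -1;
	if (sim_receive_rod_token_information(d, list_id, token))
		return -1;
	return sim_write_using_token(d, token, dst);
}
```

In this sketch the token buffer lives on the stack of sim_copy_offload(), the one function that issues all three commands. The question raised above is which layer would own that buffer if the block layer, rather than the SCSI code, had to carry the token between the second and third command.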