Bart,

> What if the source LBA range does not require splitting but the
> destination LBA range requires splitting, e.g. because it crosses a
> chunk_sectors boundary? Will the REQ_OP_COPY_IN operation succeed in
> this case and the REQ_OP_COPY_OUT operation fail?

Yes.

I experimented with approaching splitting in an iterative fashion. And
thus, if there was a split halfway through the COPY_IN I/O, we'd issue a
corresponding COPY_OUT up to the split point and hope that the write
subsequently didn't need a split. And then deal with the next segment.

However, given that copy offload offers diminishing returns for small
I/Os, it was not worth the hassle for the devices I used for
development. It was cleaner and faster to just fall back to regular
read/write when a split was required.

> Does this mean that a third operation is needed to cancel
> REQ_OP_COPY_IN operations if the REQ_OP_COPY_OUT operation fails?

No. The device times out the token.

> Additionally, how to handle bugs in REQ_OP_COPY_* submitters where a
> large number of REQ_OP_COPY_IN operations is submitted without
> corresponding REQ_OP_COPY_OUT operation? Is perhaps a mechanism
> required to discard unmatched REQ_OP_COPY_IN operations after a
> certain time?

See above. For your EXTENDED COPY use case there is no token and thus
the COPY_IN completes immediately.

And for the token case, if you populate a million tokens and don't use
them before they time out, it sounds like your submitting code is badly
broken. But it doesn't matter because there are no I/Os in flight and
thus nothing to discard.

> Hmm ... we may each have a different opinion about whether or not the
> COPY_IN/COPY_OUT semantics are a requirement for token-based copy
> offloading.

Maybe. But you'll have a hard time convincing me to add any kind of
state machine or bio matching magic to the SCSI stack when the simplest
solution is to treat copying like a read followed by a write.
There is no concurrency, no kernel state, no dependency between two
commands, nor two scsi_disk/scsi_device object lifetimes to manage.

> Additionally, I'm not convinced that implementing COPY_IN/COPY_OUT for
> ODX devices is that simple. The COPY_IN and COPY_OUT operations have
> to be translated into three SCSI commands, isn't it? I'm referring to
> the POPULATE TOKEN, RECEIVE ROD TOKEN INFORMATION and WRITE USING
> TOKEN commands. What is your opinion about how to translate the two
> block layer operations into these three SCSI commands?

COPY_IN is translated to a NOP for devices implementing EXTENDED COPY
and a POPULATE TOKEN for devices using tokens.

COPY_OUT is translated to an EXTENDED COPY (or NVMe Copy) for devices
using a single command approach and WRITE USING TOKEN for devices using
tokens.

There is no need for RECEIVE ROD TOKEN INFORMATION.

I am not aware of UFS devices using the token-based approach. And for
EXTENDED COPY there is only a single command sent to the device. If you
want to do power management while that command is being processed,
please deal with that in UFS. The block layer doesn't deal with the
async variants of any of the other SCSI commands either...

-- 
Martin K. Petersen	Oracle Linux Engineering
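
For illustration only, the COPY_IN/COPY_OUT translation described above
could be sketched as a small lookup. This is not code from any patch
series: the enum names and copy_translate() are hypothetical, and
treating COPY_IN as a NOP for NVMe Copy is an assumption based on it
being a single-command approach like EXTENDED COPY.

```c
/* Hypothetical names for illustration; not kernel identifiers. */
enum copy_op { COPY_IN, COPY_OUT };

enum copy_method {
	METHOD_XCOPY,		/* SCSI EXTENDED COPY, single command */
	METHOD_NVME_COPY,	/* NVMe Copy, single command */
	METHOD_TOKEN,		/* token-based (ODX-style) offload */
};

/*
 * Return the name of the device command a block layer copy operation
 * would be translated to. Single-command devices do nothing for
 * COPY_IN (assumed for NVMe Copy as well); token devices populate a
 * token that the later COPY_OUT consumes.
 */
static const char *copy_translate(enum copy_op op, enum copy_method method)
{
	if (op == COPY_IN)
		return method == METHOD_TOKEN ? "POPULATE TOKEN" : "NOP";

	switch (method) {
	case METHOD_XCOPY:
		return "EXTENDED COPY";
	case METHOD_NVME_COPY:
		return "NVMe Copy";
	case METHOD_TOKEN:
		return "WRITE USING TOKEN";
	}
	return "UNKNOWN";
}
```

Note that no entry maps to RECEIVE ROD TOKEN INFORMATION, and nothing
is remembered between the two calls: each operation translates to at
most one command on its own.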