On 2024/06/25 1:25, Bart Van Assche wrote: > On 6/24/24 3:44 AM, Nitesh Shetty wrote: >> For reference, I have listed the approaches we have taken in the past. >> >> a. Token/payload based approach: >> 1. Here we allocate a buffer/payload. >> 2. First source BIO is sent along with the buffer. >> 3. Once the buffer reaches driver, it is filled with the source LBA >> and length and namespace info. And the request is completed. >> 4. Then destination BIO is sent with same buffer. >> 5. Once the buffer reaches driver, it retrieves the source information from >> the BIO and forms a copy command and sends it down to device. >> >> We received feedback that putting anything inside payload which is not >> data, is not a good idea[1]. > > A token-based approach (pairing copy_src and copy_dst based on a token) > is completely different from a payload-based approach (copy offload > parameters stored in the bio payload). From [1] (I agree with what has > been quoted): "In general every time we tried to come up with a request > payload that is not just data passed to the device it has been a > nightmare." [ ... ] "The only thing we'd need is a sequence number / idr > / etc to find an input and output side match up, as long as we > stick to the proper namespace scope." > >> c. List/ctx based approach: >> A new member is added to bio, bio_copy_ctx, which will a union with >> bi_integrity. Idea is once a copy bio reaches blk_mq_submit_bio, it will >> add the bio to this list. >> 1. Send the destination BIO, once this reaches blk_mq_submit_bio, this >> will add the destination BIO to the list inside bi_copy_ctx and return >> without forming any request. >> 2. Send source BIO, once this reaches blk_mq_submit_bio, this will >> retrieve the destination BIO from bi_copy_ctx and form a request with >> destination BIO and source BIO. After this request will be sent to >> driver. >> >> This work is still in POC phase[2]. But this approach makes lifetime >> management of BIO complicated, especially during failure cases. > > Associating src and dst operations by embedding a pointer to a third > data structure in struct bio is an implementation choice and is not the > only possibility for assocating src and dst operations. Hence, the > bio lifetime complexity mentioned above is not inherent to the list > based approach but is a result of the implementation choice made for > associating src and dst operations. > > Has it been considered to combine the list-based approach for managing > unpaired copy operations with the token based approach for pairing copy > src and copy dst operations? I am still a little confused as to why we need 2 BIOs, one for src and one for dst... Is it because of the overly complex scsi extended copy support ? Given that the main use case is copy offload for data within the same device, using a single BIO which somehow can carry a list of LBA sources and a single destination LBA would be far simpler and perfectly matching nvme simple copy and ATA write gathered. And I think that this would also match the simplest case for scsi extended copy as well. -- Damien Le Moal Western Digital Research