On Fri, Dec 11, 2020 at 11:35 PM Keith Busch <kbusch@xxxxxxxxxx> wrote:
>
> On Fri, Dec 11, 2020 at 07:21:38PM +0530, SelvaKumar S wrote:
> > +int blk_copy_emulate(struct block_device *bdev, struct blk_copy_payload *payload,
> > +		gfp_t gfp_mask)
> > +{
> > +	struct request_queue *q = bdev_get_queue(bdev);
> > +	struct bio *bio;
> > +	void *buf = NULL;
> > +	int i, nr_srcs, max_range_len, ret, cur_dest, cur_size;
> > +
> > +	nr_srcs = payload->copy_range;
> > +	max_range_len = q->limits.max_copy_range_sectors << SECTOR_SHIFT;
>
> The default value for this limit is 0, and this is the function for when
> the device doesn't support copy. Are we expecting drivers to set this
> value to something else for that case?

Sorry. Missed that. Will add a fix.

>
> > +	cur_dest = payload->dest;
> > +	buf = kvmalloc(max_range_len, GFP_ATOMIC);
> > +	if (!buf)
> > +		return -ENOMEM;
> > +
> > +	for (i = 0; i < nr_srcs; i++) {
> > +		bio = bio_alloc(gfp_mask, 1);
> > +		bio->bi_iter.bi_sector = payload->range[i].src;
> > +		bio->bi_opf = REQ_OP_READ;
> > +		bio_set_dev(bio, bdev);
> > +
> > +		cur_size = payload->range[i].len << SECTOR_SHIFT;
> > +		ret = bio_add_page(bio, virt_to_page(buf), cur_size,
> > +				offset_in_page(payload));
>
> 'buf' is vmalloc'ed, so we don't necessarily have contiguous pages. I
> think you need to allocate the bio with bio_map_kern() or something like
> that instead with that kind of memory.
>

Sure. Will use bio_map_kern().

> > +		if (ret != cur_size) {
> > +			ret = -ENOMEM;
> > +			goto out;
> > +		}
> > +
> > +		ret = submit_bio_wait(bio);
> > +		bio_put(bio);
> > +		if (ret)
> > +			goto out;
> > +
> > +		bio = bio_alloc(gfp_mask, 1);
> > +		bio_set_dev(bio, bdev);
> > +		bio->bi_opf = REQ_OP_WRITE;
> > +		bio->bi_iter.bi_sector = cur_dest;
> > +		ret = bio_add_page(bio, virt_to_page(buf), cur_size,
> > +				offset_in_page(payload));
> > +		if (ret != cur_size) {
> > +			ret = -ENOMEM;
> > +			goto out;
> > +		}
> > +
> > +		ret = submit_bio_wait(bio);
> > +		bio_put(bio);
> > +		if (ret)
> > +			goto out;
> > +
> > +		cur_dest += payload->range[i].len;
> > +	}
>
> I think this would be a faster implementation if the reads were
> asynchronous with a payload buffer allocated specific to that read, and
> the callback can enqueue the write part. This would allow you to
> accumulate all the read data and write it in a single call.

Sounds like a better approach. Will add this implementation in v4.
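
For reference, the "accumulate all the reads, then write once" idea can be
modelled roughly in userspace as below. This is only a sketch of the data
flow, not the planned v4 code: the flat byte array standing in for the
device, struct copy_range, and copy_emulate_model() are all hypothetical
names made up for illustration, and plain memcpy() takes the place of the
asynchronous read bios and the single write bio. It also assumes the caller
has already checked that every range fits inside the array.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SECTOR_SHIFT 9

/* Hypothetical stand-in for one source range of the payload, in sectors. */
struct copy_range {
	uint64_t src;
	uint64_t len;
};

/*
 * Userspace model only: 'dev' is a flat byte array standing in for the
 * block device. All source ranges are gathered into one buffer first,
 * then written to the destination in a single step.
 * Returns 0 on success, -1 on an empty request or allocation failure.
 */
static int copy_emulate_model(uint8_t *dev, const struct copy_range *range,
			      int nr_srcs, uint64_t dest)
{
	size_t total = 0, off = 0;
	uint8_t *buf;
	int i;

	/* Size the buffer from the ranges, not from a queue limit. */
	for (i = 0; i < nr_srcs; i++)
		total += (size_t)range[i].len << SECTOR_SHIFT;
	if (!total)
		return -1;

	buf = malloc(total);
	if (!buf)
		return -1;

	/* Read phase: in the kernel these would be independent async reads. */
	for (i = 0; i < nr_srcs; i++) {
		size_t bytes = (size_t)range[i].len << SECTOR_SHIFT;

		memcpy(buf + off,
		       dev + ((size_t)range[i].src << SECTOR_SHIFT), bytes);
		off += bytes;
	}

	/* Write phase: one contiguous write after all reads have finished. */
	memcpy(dev + ((size_t)dest << SECTOR_SHIFT), buf, total);
	free(buf);
	return 0;
}
```

Sizing the buffer from the sum of the range lengths also sidesteps the
zero-valued limit raised above, since the model never consults a queue limit.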