Re: [PATCH 15/17] io_uring/uring_cmd: defer SQE copying until we need it

On 3/25/24 6:41 AM, Anuj Gupta wrote:
> On Thu, Mar 21, 2024 at 4:28 AM Jens Axboe <axboe@xxxxxxxxx> wrote:
>>
>> The previous commit turned on async data for uring_cmd, and did the
>> basic conversion of setting everything up on the prep side. However, for
>> a lot of use cases, we'll get -EIOCBQUEUED on issue, which means we do
>> not need a persistent copy of the big SQE.
>>
>> Unless we're going async immediately, defer copying the big SQE until
>> we know we have to.
>>
>> This greatly reduces the overhead of such commands, as evidenced by
>> a perf diff from before and after this change:
>>
>>     10.60%     -8.58%  [kernel.vmlinux]  [k] io_uring_cmd_prep
>>
>> where the prep side drops from 10.60% to ~2%, which is more in line
>> with expectations.
>> Performance also rises from ~113M IOPS to ~122M IOPS, bringing us back
>> to where it was before the async command prep.
>>
>> Signed-off-by: Jens Axboe <axboe@xxxxxxxxx>
>> ---
>>  io_uring/uring_cmd.c | 25 +++++++++++++++++++------
>>  1 file changed, 19 insertions(+), 6 deletions(-)
>>
>> diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
>> index 9bd0ba87553f..92346b5d9f5b 100644
>> --- a/io_uring/uring_cmd.c
>> +++ b/io_uring/uring_cmd.c
>> @@ -182,12 +182,18 @@ static int io_uring_cmd_prep_setup(struct io_kiocb *req,
>>         struct uring_cache *cache;
>>
>>         cache = io_uring_async_get(req);
>> -       if (cache) {
>> -               memcpy(cache->sqes, sqe, uring_sqe_size(req->ctx));
>> -               ioucmd->sqe = req->async_data;
>> +       if (unlikely(!cache))
>> +               return -ENOMEM;
>> +
>> +       if (!(req->flags & REQ_F_FORCE_ASYNC)) {
>> +               /* defer memcpy until we need it */
>> +               ioucmd->sqe = sqe;
>>                 return 0;
>>         }
>> -       return -ENOMEM;
>> +
>> +       memcpy(req->async_data, sqe, uring_sqe_size(req->ctx));
>> +       ioucmd->sqe = req->async_data;
>> +       return 0;
>>  }
>>
>>  int io_uring_cmd_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>> @@ -245,8 +251,15 @@ int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags)
>>         }
>>
>>         ret = file->f_op->uring_cmd(ioucmd, issue_flags);
>> -       if (ret == -EAGAIN || ret == -EIOCBQUEUED)
>> -               return ret;
>> +       if (ret == -EAGAIN) {
>> +               struct uring_cache *cache = req->async_data;
>> +
>> +               if (ioucmd->sqe != (void *) cache)
>> +                       memcpy(cache, ioucmd->sqe, uring_sqe_size(req->ctx));
>> +               return -EAGAIN;
>> +       } else if (ret == -EIOCBQUEUED) {
>> +               return -EIOCBQUEUED;
>> +       }
>>
>>         if (ret < 0)
>>                 req_set_fail(req);
>> --
>> 2.43.0
>>
>>
> 
> The io_uring_cmd plumbing part of this series looks good to me.
> I tested it with io_uring NVMe passthrough on my setup with two
> Optane drives, and there is no drop in performance [1].
> For this and the previous patch,
> 
> Tested-by: Anuj Gupta <anuj20.g@xxxxxxxxxxx>
> Reviewed-by: Anuj Gupta <anuj20.g@xxxxxxxxxxx>

Thanks for reviewing and testing, will add.

-- 
Jens Axboe
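
For readers of the archive, here is a minimal userspace C sketch of the
defer-until-needed copy pattern the patch implements. The names and the
plain malloc() are hypothetical stand-ins, not the kernel code; the real
implementation uses the io_uring async-data cache shown in the diff above.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct cmd {
	const void *sqe;	/* may point at short-lived, caller-owned memory */
	void *async_data;	/* long-lived storage, safe across a suspend */
	size_t sqe_size;
};

/* Prep: stash the pointer; copy up front only if we know we'll go async. */
static int cmd_prep(struct cmd *c, const void *sqe, size_t size, int force_async)
{
	c->async_data = malloc(size);
	if (!c->async_data)
		return -1;
	c->sqe_size = size;
	if (!force_async) {
		/* fast path: just remember the pointer, defer the memcpy */
		c->sqe = sqe;
		return 0;
	}
	/* we know we're going async, so make a stable copy now */
	memcpy(c->async_data, sqe, size);
	c->sqe = c->async_data;
	return 0;
}

/* Issue hit -EAGAIN: the SQE must now outlive the caller, so stabilize it. */
static void cmd_going_async(struct cmd *c)
{
	/* copy only if we haven't already done so */
	if (c->sqe != c->async_data) {
		memcpy(c->async_data, c->sqe, c->sqe_size);
		c->sqe = c->async_data;
	}
}

int main(void)
{
	char sqe[128] = "example";
	struct cmd c;

	if (cmd_prep(&c, sqe, sizeof(sqe), 0))
		return 1;
	cmd_going_async(&c);	/* retry path: copy happens here, once */
	printf("%s\n", (const char *)c.sqe);
	free(c.async_data);
	return 0;
}

The point of the split is that the common inline-completion path never pays
for the 128-byte copy; only an operation that must be retried later, after
the caller's SQE memory may be gone, does.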




