Re: [PATCH 7/8] io_uring: support readv/writev with fixed buffers

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 17/11/2020 22:59, Bijan Mottahedeh wrote:
> 
>>> Support readv/writev with fixed buffers, and introduce IOSQE_FIXED_BUFFER,
>>> consistent with fixed files.
>>
>> I don't like it at all, see issues below. The actual implementation would
>> be much uglier.
>>
>> I propose you to split the series and push separately. Your first 6 patches
>> first, I don't have conceptual objections to them. Then registration sharing
>> (I still need to look it up). And then we can return to this, if you're not
>> yet convinced.
> 
> Ok.  The sharing patch is actually the highest priority for us so it'd be great to know if you think it's in the right direction.
> 
> Should I submit them as they are or address your fixed_file_ref comments in Patch 4/8 as well?  Would I need your prep patch beforehand?

Ok, there are 2 new patches in 5.10, you may wait for Jens to propagate it to
5.11 or just cherry-pick (no conflicts expected) them. On top apply ("io_uring:
share fixed_file_refs b/w multiple rsrcs") to which I CC'ed you. It's
simple enough so shouldn't be much problems with it.

With that you need to call io_set_resource_node() every time you take
a resource. That's it, _no_ extra ref_put for you to add in puts/free/etc.

Also, consider that all ref_nodes of all types should be hooked into a
single ->ref_list (e.g. file_data->ref_list).

> 
>>> +static ssize_t io_import_iovec_fixed(int rw, struct io_kiocb *req, void *buf,
>>> +                     unsigned segs, unsigned fast_segs,
>>> +                     struct iovec **iovec,
>>> +                     struct iov_iter *iter)
>>> +{
>>> +    struct io_ring_ctx *ctx = req->ctx;
>>> +    struct io_mapped_ubuf *imu;
>>> +    struct iovec *iov;
>>> +    u16 index, buf_index;
>>> +    ssize_t ret;
>>> +    unsigned long seg;
>>> +
>>> +    if (unlikely(!ctx->buf_data))
>>> +        return -EFAULT;
>>> +
>>> +    ret = import_iovec(rw, buf, segs, fast_segs, iovec, iter);
>>
>> Did you test it? import_iovec() does access_ok() against each iov_base,
>> which in your case are an index.
> 
> I used liburing:test/file-{register,update} as models for the equivalent buffer tests and they seem to work.  I can send out the tests and the liburing changes if you want.
> 
> The fixed io test registers an empty iov table first:
> 
>     ret = io_uring_register_buffers(ring, iovs, UIO_MAXIOV);
> 
> It next updates the table with two actual buffers at offset 768:
> 
>         ret = io_uring_register_buffers_update(ring, 768, ups, 2);
> 
> It later uses the buffer at index 768 for writev similar to the file-register test (IOSQE_FIXED_BUFFER instead of IOSQE_FIXED_FILE):
> 
>         iovs[768].iov_base = (void *)768;
>         iovs[768].iov_len = pagesize;
> 
> 
>         io_uring_prep_writev(sqe, fd, iovs, 1, 0);
>         sqe->flags |= IOSQE_FIXED_BUFFER;
> 
>         ret = io_uring_submit(ring);
> 
> Below is the iovec returned from
> 
> io_import_iovec_fixed()
> -> io_import_vec()
> 
> {iov_base = 0x300 <dm_early_create+412>, iov_len = 4096}
> 
>>> +    if (ret < 0)
>>> +        return ret;
>>> +
>>> +    iov = (struct iovec *)iter->iov;
>>> +
>>> +    for (seg = 0; seg < iter->nr_segs; seg++) {
>>> +        buf_index = *(u16 *)(&iov[seg].iov_base);
>>
>> That's ugly, and also not consistent with rw_fixed, because iov_base is
>> used to calculate offset.
> 
> Would offset be applicable when using readv/writev?
> 
> My thinkig was that for those cases, each iovec should be used exactly as registered.
> 
>>
>>> +        if (unlikely(buf_index < 0 || buf_index >= ctx->nr_user_bufs))
>>> +            return -EFAULT;
>>> +
>>> +        index = array_index_nospec(buf_index, ctx->nr_user_bufs);
>>> +        imu = io_buf_from_index(ctx, index);
>>> +        if (!imu->ubuf || !imu->len)
>>> +            return -EFAULT;
>>> +        if (iov[seg].iov_len > imu->len)
>>> +            return -EFAULT;
>>> +
>>> +        iov[seg].iov_base = (void *)imu->ubuf;
>>
>> Nope, that's not different from non registered version.
>> What import_fixed actually do is setting up the iter argument to point
>> to a bvec (a vector of struct page *).
> 
> So in fact, the buffers end up being pinned again because they are not being as bvecs?
> 
>>
>> So it either would need to keep a vector of bvecs, that's a vector of vectors,
>> that's not supported by iter, etc., so you'll also need to iterate over them
>> in io_read/write and so on. Or flat 2D structure into 1D, but that's still ugly.
> 
> So you're saying there's no clean way to create a readv/writev + fixed buffers API?  It would've been nice to have a consistent API between files and buffers.
> 
> 
>>> @@ -5692,7 +5743,7 @@ static int io_timeout_remove_prep(struct io_kiocb *req,
>>>   {
>>>       if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
>>>           return -EINVAL;
>>> -    if (unlikely(req->flags & (REQ_F_FIXED_FILE | REQ_F_BUFFER_SELECT)))
>>> +    if (unlikely(req->flags & (REQ_F_FIXED_RSRC | REQ_F_BUFFER_SELECT)))
>>
>> Why it's here?
>>
>> #define REQ_F_FIXED_RSRC    (REQ_F_FIXED_FILE | REQ_F_FIXED_BUFFER)
>> So, why do you | with REQ_F_BUFFER_SELECT again here?
> 
> I thought to group fixed files/buffers but distinguish them from selected buffers.
> 
>>> @@ -87,6 +88,8 @@ enum {
>>>   #define IOSQE_ASYNC        (1U << IOSQE_ASYNC_BIT)
>>>   /* select buffer from sqe->buf_group */
>>>   #define IOSQE_BUFFER_SELECT    (1U << IOSQE_BUFFER_SELECT_BIT)
>>> +/* use fixed buffer set */
>>> +#define IOSQE_FIXED_BUFFER    (1U << IOSQE_FIXED_BUFFER_BIT)
>>
>> Unfortenatuly, we're almost out of flags bits -- it's a 1 byte
>> field and 6 bits are already taken. Let's not use it.

-- 
Pavel Begunkov



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux