Re: [PATCH v3 1/2] io_uring: change the poll events to be 32-bits

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 2020/6/15 下午11:09, Jens Axboe wrote:
> On 6/14/20 8:49 PM, Jiufei Xue wrote:
>> Hi Jens,
>>
>> On 2020/6/13 上午12:48, Jens Axboe wrote:
>>> On 6/12/20 8:58 AM, Jens Axboe wrote:
>>>> On 6/11/20 8:30 PM, Jiufei Xue wrote:
>>>>> poll events should be 32-bits to cover EPOLLEXCLUSIVE.
>>>>>
>>>>> Signed-off-by: Jiufei Xue <jiufei.xue@xxxxxxxxxxxxxxxxx>
>>>>> ---
>>>>>  fs/io_uring.c                 | 4 ++--
>>>>>  include/uapi/linux/io_uring.h | 2 +-
>>>>>  tools/io_uring/liburing.h     | 2 +-
>>>>>  3 files changed, 4 insertions(+), 4 deletions(-)
>>>>>
>>>>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>>>>> index 47790a2..6250227 100644
>>>>> --- a/fs/io_uring.c
>>>>> +++ b/fs/io_uring.c
>>>>> @@ -4602,7 +4602,7 @@ static void io_poll_queue_proc(struct file *file, struct wait_queue_head *head,
>>>>>  static int io_poll_add_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>>>>>  {
>>>>>  	struct io_poll_iocb *poll = &req->poll;
>>>>> -	u16 events;
>>>>> +	u32 events;
>>>>>  
>>>>>  	if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
>>>>>  		return -EINVAL;
>>>>> @@ -8196,7 +8196,7 @@ static int __init io_uring_init(void)
>>>>>  	BUILD_BUG_SQE_ELEM(28, /* compat */   int, rw_flags);
>>>>>  	BUILD_BUG_SQE_ELEM(28, /* compat */ __u32, rw_flags);
>>>>>  	BUILD_BUG_SQE_ELEM(28, __u32,  fsync_flags);
>>>>> -	BUILD_BUG_SQE_ELEM(28, __u16,  poll_events);
>>>>> +	BUILD_BUG_SQE_ELEM(28, __u32,  poll_events);
>>>>>  	BUILD_BUG_SQE_ELEM(28, __u32,  sync_range_flags);
>>>>>  	BUILD_BUG_SQE_ELEM(28, __u32,  msg_flags);
>>>>>  	BUILD_BUG_SQE_ELEM(28, __u32,  timeout_flags);
>>>>> diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
>>>>> index 92c2269..afc7edd 100644
>>>>> --- a/include/uapi/linux/io_uring.h
>>>>> +++ b/include/uapi/linux/io_uring.h
>>>>> @@ -31,7 +31,7 @@ struct io_uring_sqe {
>>>>>  	union {
>>>>>  		__kernel_rwf_t	rw_flags;
>>>>>  		__u32		fsync_flags;
>>>>> -		__u16		poll_events;
>>>>> +		__u32		poll_events;
>>>>>  		__u32		sync_range_flags;
>>>>>  		__u32		msg_flags;
>>>>>  		__u32		timeout_flags;
>>>>
>>>> We obviously have the space in there as most other flag members are 32-bits, but
>>>> I'd want to double check if we're not changing the ABI here. Is this always
>>>> going to be safe, on any platform, regardless of endianess etc?
>>>
>>> Double checked, and as I feared, we can't safely do this. We'll have to
>>> do something like the below, grabbing an unused bit of the poll mask
>>> space and if that's set, then store the fact that EPOLLEXCLUSIVE is set.
>>> So probably best to turn this just into one patch, since it doesn't make
>>> a lot of sense to do it as a prep patch at that point.
>>>
>> Yes, Agree about that. But I also fear that if the unused bit is used
>> in the feature, it will bring unexpected behavior.
> 
> Yeah, it's certainly not the prettiest and could potentially be fragile.
> I'm open to suggestions, we need some way of signaling that the 32-bit
> variant of the poll_events should be used. We could potentially make
> this work by doing explicit layout for big endian vs little endian, that
> might be prettier and wouldn't suffer from the "grab some random bit"
> issue.
> 
Thank you for your suggestion, I will think about it.

>>> This does have the benefit of not growing io_poll_iocb. With your patch,
>>> it'd go beyond a cacheline, and hence bump the size of the entire
>>> io_iocb as well, which would be very unfortunate.
>>>
>> events in io_poll_iocb is 32-bits already, so why it will bump the
>> size of the io_iocb structure with my patch? 
> 
> It's not 32-bits already, it's a __poll_t type which is 16-bits only.
> 
Yes, it is a __poll_t type, but I found that __poll_t type is 32-bits
with the definition below:

typedef unsigned __bitwise __poll_t;

And I also investigate it with crash:
crash> io_poll_iocb -ox
struct io_poll_iocb {
   [0x0] struct file *file;
         union {
   [0x8]     struct wait_queue_head *head;
   [0x8]     u64 addr;
         };
  [0x10] __poll_t events;
  [0x14] bool done;
  [0x15] bool canceled;
  [0x18] struct wait_queue_entry wait;
}



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux