Re: [PATCH] epoll: add exclusive wakeups flag

On 03/14/2016 01:47 PM, Michael Kerrisk (man-pages) wrote:
> [Restoring CC, which I see I accidentally dropped, one iteration back.]
> 
> Hi Jason,
> 
> Thanks for the review. I've tweaked one piece to respond to your
> feedback. But I also have another new question below.
> 
> On 03/15/2016 03:55 AM, Jason Baron wrote:
>> On 03/11/2016 06:25 PM, Michael Kerrisk (man-pages) wrote:
>>> On 03/11/2016 09:51 PM, Jason Baron wrote:
>>>> On 03/11/2016 03:30 PM, Michael Kerrisk (man-pages) wrote:
> 
> [...]
> 
>> Hi Michael,
>>
>> Looks good. One comment below.
>>
>> Thanks,
>>
>>>        EPOLLEXCLUSIVE (since Linux 4.5)
>>>               Sets  an  exclusive  wakeup  mode  for  the  epoll  file
>>>               descriptor  that  is  being  attached to the target file
>>>               descriptor, fd.  When a wakeup event occurs and multiple
>>>               epoll  file  descriptors are attached to the same target
>>>               file using EPOLLEXCLUSIVE, one or more of the epoll file
>>>               descriptors  will  receive  an event with epoll_wait(2).
>>>               The default in this scenario (when EPOLLEXCLUSIVE is not
>>>               set)  is  for  all  epoll file descriptors to receive an
>>>               event.  EPOLLEXCLUSIVE is thus useful for avoiding thun‐
>>>               dering herd problems in certain scenarios.
>>>
>>>               If  the  same  file  descriptor  is  in  multiple  epoll
>>>               instances, some with the EPOLLEXCLUSIVE flag, and others
>>>               without,  then  events  will  be  provided  to all epoll
>>>               instances that did not specify  EPOLLEXCLUSIVE,  and  at
>>>               least  one  of  the  epoll  instances  that  did specify
>>>               EPOLLEXCLUSIVE.
>>>
>>>               The following values may  be  specified  in  conjunction
>>>               with EPOLLEXCLUSIVE: EPOLLIN, EPOLLOUT, EPOLLWAKEUP, and
>>>               EPOLLET.  EPOLLHUP and EPOLLERR can also  be  specified,
>>>               but  are  ignored (as usual).  Attempts to specify other
>>
>> I'm not sure 'ignored' is the right wording here. 'EPOLLHUP' and
>> 'EPOLLERR' are always included in the set of events when something is
>> added as EPOLLEXCLUSIVE. This is consistent with the non-EPOLLEXCLUSIVE
>> add case. 
> 
> Yes.
> 
>> So 'EPOLLHUP' and 'EPOLLERR' may be specified but will be
>> included in the set of events on an add, whether they are specified or not.
> 
> Yes. I understand your discomfort with the word "ignored", but the
> problem was that, because it made special mention of EPOLLHUP and EPOLLERR,
> your proposed text made it sound as though EPOLLEXCLUSIVE somehow was
> special with respect to these two flags. I wanted to clarify that it is not.
> How about this:
> 
>               The following values may  be  specified  in  conjunction
>               with EPOLLEXCLUSIVE: EPOLLIN, EPOLLOUT, EPOLLWAKEUP, and
>               EPOLLET.  EPOLLHUP and EPOLLERR can also  be  specified,
>               but  this  is  not  required: as usual, these events are
>               always reported if they  occur,  regardless  of  whether
>               they are specified in events.
> ?

Yes, nothing special here with respect to EPOLLHUP and EPOLLERR. So this
looks fine to me.
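
To illustrate the "as usual" part, here is a minimal sketch (not taken from the patch or the man page) that registers the read end of a pipe with only the requested bits, closes the write end, and returns whatever mask epoll_wait(2) reports. The function name events_after_writer_close() is made up for this example, and the pipe is only a convenient way to provoke a hangup.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: add the read end of a pipe with only the `requested`
 * bits, close the write end, and return the event mask that epoll_wait(2)
 * reports (0 on timeout or on any setup error). */
uint32_t events_after_writer_close(uint32_t requested)
{
    struct epoll_event ev = { .events = requested };
    struct epoll_event out = { 0 };
    uint32_t mask = 0;
    int p[2];
    int epfd;

    if (pipe(p) < 0)
        return 0;

    epfd = epoll_create1(0);
    if (epfd >= 0 && epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == 0) {
        /* Closing the only writer makes the read end report a hangup. */
        close(p[1]);
        p[1] = -1;
        if (epoll_wait(epfd, &out, 1, 1000) == 1)
            mask = out.events;
    }

    if (epfd >= 0)
        close(epfd);
    if (p[1] >= 0)
        close(p[1]);
    close(p[0]);
    return mask;
}
```

Calling events_after_writer_close(EPOLLIN) should return a mask with EPOLLHUP set even though EPOLLHUP was never requested; passing EPOLLIN | EPOLLEXCLUSIVE on a 4.5+ kernel is expected to behave the same way.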

> 
>>>               values in events yield an error.  EPOLLEXCLUSIVE may  be
>>>               used  only  in  an  EPOLL_CTL_ADD operation; attempts to
>>>               employ  it  with  EPOLL_CTL_MOD  yield  an  error.    If
>>>               EPOLLEXCLUSIVE has been set using epoll_ctl(2),  then  a
>>>               subsequent  EPOLL_CTL_MOD  on  the  same  epfd,  fd pair
>>>               yields an
>>>               error.  An epoll_ctl(2) that specifies EPOLLEXCLUSIVE in
>>>               events and specifies the target file descriptor fd as an
>>>               epoll  instance will likewise fail.  The error in all of
>>>               these cases is EINVAL.
>>>
>>>    ERRORS
>>>        EINVAL An invalid event type was specified along with  EPOLLEX‐
>>>               CLUSIVE in events.
>>>
>>>        EINVAL op was EPOLL_CTL_MOD and events included EPOLLEXCLUSIVE.
>>>
>>>        EINVAL op  was  EPOLL_CTL_MOD  and  the EPOLLEXCLUSIVE flag has
>>>               previously been applied to this epfd, fd pair.
>>>
>>>        EINVAL EPOLLEXCLUSIVE was specified in events and fd refers  to
>>>               an epoll instance.
> 
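
For reference, the four EINVAL cases listed above can be exercised with a short sketch along these lines. It assumes a 4.5+ kernel; the helper names ctl_errno() and exclusive_einval_cases() are invented for the example, EPOLLONESHOT is used only as one instance of a disallowed event type, and the fallback #define is for older userspace headers that lack the flag.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

#ifndef EPOLLEXCLUSIVE
#define EPOLLEXCLUSIVE (1u << 28)   /* fallback for older userspace headers */
#endif

/* Hypothetical helper: run one epoll_ctl(2) call and return 0 on success,
 * otherwise the errno it set. */
static int ctl_errno(int epfd, int op, int fd, uint32_t events)
{
    struct epoll_event ev = { .events = events };

    return epoll_ctl(epfd, op, fd, &ev) == 0 ? 0 : errno;
}

/* Fills res[0..3] with the errno of each case from the ERRORS list.
 * Returns 0 if the setup steps worked, -1 otherwise. */
int exclusive_einval_cases(int res[4])
{
    int p[2], q[2];
    int epfd = epoll_create1(0);
    int inner = epoll_create1(0);

    if (epfd < 0 || inner < 0 || pipe(p) < 0 || pipe(q) < 0)
        return -1;

    /* 1. An event type that is not allowed together with EPOLLEXCLUSIVE. */
    res[0] = ctl_errno(epfd, EPOLL_CTL_ADD, p[0],
                       EPOLLIN | EPOLLONESHOT | EPOLLEXCLUSIVE);

    /* 2. EPOLL_CTL_MOD with EPOLLEXCLUSIVE in events. */
    if (ctl_errno(epfd, EPOLL_CTL_ADD, p[0], EPOLLIN) != 0)
        return -1;
    res[1] = ctl_errno(epfd, EPOLL_CTL_MOD, p[0], EPOLLIN | EPOLLEXCLUSIVE);

    /* 3. EPOLL_CTL_MOD on a pair that was added with EPOLLEXCLUSIVE. */
    if (ctl_errno(epfd, EPOLL_CTL_ADD, q[0], EPOLLIN | EPOLLEXCLUSIVE) != 0)
        return -1;
    res[2] = ctl_errno(epfd, EPOLL_CTL_MOD, q[0], EPOLLIN);

    /* 4. The target fd is itself an epoll instance. */
    res[3] = ctl_errno(epfd, EPOLL_CTL_ADD, inner, EPOLLIN | EPOLLEXCLUSIVE);

    close(p[0]); close(p[1]);
    close(q[0]); close(q[1]);
    close(inner);
    close(epfd);
    return 0;
}
```

If the text above is accurate, every entry of res should come back as EINVAL.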
> Returning to the second sentence in this description:
> 
>               When a wakeup event occurs and multiple epoll file descrip‐
>               tors are attached to the same target file using EPOLLEXCLU‐
>               SIVE, one or  more  of  the  epoll  file  descriptors  will
>               receive  an  event with epoll_wait(2).
> 
> There is a point that is unclear to me: what does "target file" refer to?
> Is it an open file description (aka open file table entry) or an inode?
> I suspect the former, but it was not clear in your original text.
>

So from epoll's perspective, the wakeups are associated with a 'wait
queue'. So if the open() and subsequent EPOLL_CTL_ADD (which is done via
file->poll()) results in adding to the same 'wait queue' then we will
get 'exclusive' wakeup behavior.

So in general, I think the answer here is that it's associated with the
inode (I couldn't say with 100% certainty without really looking at all
file->poll() implementations). Certainly, with the 'FIFO' example below,
the two scenarios will have the same behavior with respect to
EPOLLEXCLUSIVE.

Also, the 'non-exclusive' mode would be subject to the same question of
which wait queue the epfd is associated with...

Thanks,

-Jason

> To make this point even clearer, here are two scenarios I'm thinking of.
> In each case, we're talking of monitoring the read end of a FIFO.
> 
> ===
> 
> Scenario 1:
> 
> We have three processes each of which
> 1. Creates an epoll instance
> 2. Opens the read end of the FIFO
> 3. Adds the read end of the FIFO to the epoll instance, specifying
>    EPOLLEXCLUSIVE
> 
> When input becomes available on the FIFO, how many processes
> get a wakeup?
> 
> ===
> 
> Scenario 2:
> 
> A parent process opens the read end of a FIFO and then calls
> fork() three times to create three children. Each child then:
> 
> 1. Creates an epoll instance
> 2. Adds the read end of the FIFO to the epoll instance, specifying
>    EPOLLEXCLUSIVE
> 
> When input becomes available on the FIFO, how many processes
> get a wakeup?
> 
> ===
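
One way to check the fork()-based scenario empirically is a sketch like the one below. It is only an assumption-laden test harness, not something from the patch: the FIFO path under /tmp, the helper names child_wait() and count_exclusive_wakeups(), the 500 ms timeout, and the 100 ms settling delay are all arbitrary choices, and the parent opens the write end up front so that the later write is the only wakeup source.

```c
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <sys/epoll.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef EPOLLEXCLUSIVE
#define EPOLLEXCLUSIVE (1u << 28)   /* fallback for older userspace headers */
#endif

#define NCHILD 3

/* Child: create an epoll instance, add the inherited FIFO read end with
 * EPOLLEXCLUSIVE, report readiness, then wait.
 * Exit status: 1 = got an event, 0 = timed out, 2 = setup error. */
static void child_wait(int rfd, int ready_wfd)
{
    struct epoll_event ev = { .events = EPOLLIN | EPOLLEXCLUSIVE };
    struct epoll_event out;
    int epfd = epoll_create1(0);

    if (epfd < 0 || epoll_ctl(epfd, EPOLL_CTL_ADD, rfd, &ev) < 0)
        _exit(2);
    if (write(ready_wfd, "r", 1) != 1)
        _exit(2);
    _exit(epoll_wait(epfd, &out, 1, 500) == 1 ? 1 : 0);
}

/* Returns how many of the NCHILD children saw an event, or -1 on error. */
int count_exclusive_wakeups(void)
{
    char path[64], buf[NCHILD];
    int ready[2], rfd, wfd, got = 0, woken = 0;

    snprintf(path, sizeof(path), "/tmp/epollexcl-%ld.fifo", (long)getpid());
    if (mkfifo(path, 0600) < 0 || pipe(ready) < 0)
        return -1;
    rfd = open(path, O_RDONLY | O_NONBLOCK);   /* read end, shared via fork() */
    wfd = open(path, O_WRONLY);
    unlink(path);
    if (rfd < 0 || wfd < 0)
        return -1;

    for (int i = 0; i < NCHILD; i++) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0)
            child_wait(rfd, ready[1]);         /* never returns */
    }
    close(ready[1]);

    /* Wait until every child has registered, then give them a moment to
     * block in epoll_wait(2) before making input available. */
    while (got < NCHILD) {
        ssize_t n = read(ready[0], buf, (size_t)(NCHILD - got));
        if (n <= 0)
            return -1;
        got += (int)n;
    }
    poll(NULL, 0, 100);
    if (write(wfd, "x", 1) != 1)
        return -1;

    for (int i = 0; i < NCHILD; i++) {
        int status;
        if (wait(&status) < 0)
            return -1;
        if (WIFEXITED(status) && WEXITSTATUS(status) == 1)
            woken++;
    }
    close(rfd); close(wfd); close(ready[0]);
    return woken;
}
```

The return value is the number of children that got a wakeup. Going by the "one or more" wording, it should be at least 1 and may be less than 3; dropping EPOLLEXCLUSIVE from the child's event mask gives the non-exclusive baseline for comparison.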
> 
> Cheers,
> 
> Michael
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


