Re: [PATCH] epoll: add exclusive wakeups flag

Hi Jason and Michael,

Hmm... I tried playing with the pipe samples below, but even with a sleep added, all of the processes got woken up (maybe I'm missing something, too). I also tried with EPOLLIN.

On the same basis, I created a sample using POSIX message queues with EPOLLIN | EPOLLEXCLUSIVE, and the good news is that it works correctly.

file q.c:
==================
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/epoll.h>
#include <fcntl.h>
#include <sys/wait.h>
#include <errno.h>
#include <mqueue.h>
#include <unistd.h>		/* fork(), sleep() */

#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \
                        } while (0)

#define usageErr(msg, progName) \
                        do { fprintf(stderr, "Usage: "); \
                             fprintf(stderr, msg, progName); \
                             exit(EXIT_FAILURE); } while (0)

#ifndef EPOLLEXCLUSIVE
#define EPOLLEXCLUSIVE (1 << 28)
#endif

#define MAX_SIZE 10

int
main (int argc, char *argv[])
{
  int epfd, nready;
  struct epoll_event ev, rev;
  mqd_t fd;
  struct mq_attr attr;
  char buffer[MAX_SIZE + 1];
  int cnum;

  /* initialize the queue attributes */
  attr.mq_flags = 0;
  attr.mq_maxmsg = 5;
  attr.mq_msgsize = MAX_SIZE;
  attr.mq_curmsgs = 0;

  /* cleanup for multiple runs... */
  mq_unlink ("/TESTQ");

  /* create the message queue */
  fd =
    mq_open ("/TESTQ", O_CREAT | O_RDWR | O_NONBLOCK, S_IWUSR | S_IRUSR,
	     &attr);
  if (fd == -1)
    errExit ("open");

  for (cnum = 0; cnum < 3; cnum++)
    {
      switch (fork ())
	{
	case -1:
	  errExit ("fork");

	case 0:		/* Child */
	  epfd = epoll_create (2);
	  if (epfd == -1)
	    errExit ("epoll_create");

	  ev.events = EPOLLIN | EPOLLEXCLUSIVE;
	  if (epoll_ctl (epfd, EPOLL_CTL_ADD, fd, &ev) == -1)
	    errExit ("epoll_ctl");

	  printf ("About to wait...\n");
	  nready = epoll_wait (epfd, &rev, 1, -1);
	  if (nready == -1)
	    errExit ("epoll-wait");

	  printf ("Child %d: epoll_wait() returned %d\n", cnum, nready);
	  exit (EXIT_SUCCESS);

	default:
	  break;
	}
    }
  sleep (1);
  /* send a message to the queue */
  memset (buffer, 0, MAX_SIZE);
  if (0 > mq_send (fd, buffer, MAX_SIZE, 0))
    errExit ("mq_send");
  printf ("msg sent ok...\n");

  wait (NULL);
  wait (NULL);
  wait (NULL);

  exit (EXIT_SUCCESS);
}
==================

$ gcc q.c -lrt
$ ./a.out
About to wait...
About to wait...
About to wait...
msg sent ok...
Child 2: epoll_wait() returned 1
^C
$



Best regards,
Madars


Jason Baron @ 2016-03-15 00:35 wrote:
Hi Michael,

On 03/14/2016 05:03 PM, Michael Kerrisk (man-pages) wrote:
Hi Jason,

On 03/15/2016 09:01 AM, Michael Kerrisk (man-pages) wrote:
Hi Jason,

On 03/15/2016 08:32 AM, Jason Baron wrote:


On 03/14/2016 01:47 PM, Michael Kerrisk (man-pages) wrote:
[Restoring CC, which I see I accidentally dropped, one iteration back.]

[...]

Returning to the second sentence in this description:

    When a wakeup event occurs and multiple epoll file descriptors
    are attached to the same target file using EPOLLEXCLUSIVE, one
    or more of the epoll file descriptors will receive an event
    with epoll_wait(2).

There is a point that is unclear to me: what does "target file" refer to? Is it an open file description (aka open file table entry) or an inode?
I suspect the former, but it was not clear in your original text.
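For readers less familiar with the distinction being asked about, here is a minimal sketch of how an open file description differs from an inode. It is not part of the patch or of the test programs in this thread. The helper names (same_description, compare_dup_and_reopen) and the temporary file path are made up for illustration. The sketch relies on the fact that descriptors created by dup() (or inherited across fork()) share one file offset, while a second open() of the same path gets its own.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if fd_a and fd_b share a file offset
   (i.e. refer to the same open file description), 0 if they do not,
   -1 on error. Both descriptors must be seekable. */
int same_description(int fd_a, int fd_b)
{
    off_t pos;

    /* Reset fd_b, move fd_a, then see whether fd_b moved too. */
    if (lseek(fd_b, 0, SEEK_SET) == -1 || lseek(fd_a, 7, SEEK_SET) == -1)
        return -1;
    pos = lseek(fd_b, 0, SEEK_CUR);
    if (pos == -1)
        return -1;
    return pos == 7;
}

/* Hypothetical demo: returns a bitmask, or -1 on error.
   Bit 0 is set if a dup()ed descriptor shares the description.
   Bit 1 is set if a second open() of the same path shares it. */
int compare_dup_and_reopen(void)
{
    char path[] = "/tmp/ofd_demo_XXXXXX";   /* illustrative path only */
    int fd, dup_fd, reopened, a, b, result = -1;

    fd = mkstemp(path);
    if (fd == -1)
        return -1;

    dup_fd = dup(fd);                  /* same open file description */
    reopened = open(path, O_RDONLY);   /* same inode, new description */

    if (dup_fd != -1 && reopened != -1) {
        a = same_description(fd, dup_fd);
        b = same_description(fd, reopened);
        if (a != -1 && b != -1)
            result = a | (b << 1);
    }

    if (dup_fd != -1)
        close(dup_fd);
    if (reopened != -1)
        close(reopened);
    close(fd);
    unlink(path);
    return result;
}
```

On a typical Linux system compare_dup_and_reopen() should return 1: the dup()ed descriptor shares the description, the reopened one does not, even though all three descriptors refer to the same inode.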


So from epoll's perspective, the wakeups are associated with a 'wait
queue'. So if the open() and subsequent EPOLL_CTL_ADD (which is done via file->poll()) results in adding to the same 'wait queue' then we will
get 'exclusive' wakeup behavior.

So in general, I think the answer here is that it's associated with the inode (I couldn't say with 100% certainty without really looking at all file->poll() implementations). Certainly, with the 'FIFO' example below,
the two scenarios will have the same behavior with respect to
EPOLLEXCLUSIVE.

So, I was actually a little surprised by this, and went away and tested this point. It appears to me that the two scenarios described below do NOT have the same behavior with respect to EPOLLEXCLUSIVE. See below.

So, in both scenarios, *one or more* processes will get a wakeup?
(I'll try to add something to the text to clarify the detail we're
discussing.)

Also, the 'non-exclusive' mode would be subject to the same question of
which wait queue the epfd is associated with...

I'm not sure of the point you are trying to make here?

Cheers,

Michael


To make this point even clearer, here are two scenarios I'm thinking of.
In each case, we're talking of monitoring the read end of a FIFO.

===

Scenario 1:

We have three processes each of which
1. Creates an epoll instance
2. Opens the read end of the FIFO
3. Adds the read end of the FIFO to the epoll instance, specifying
   EPOLLEXCLUSIVE

When input becomes available on the FIFO, how many processes
get a wakeup?

When I test this scenario, all three processes get a wakeup.

===

Scenario 2:

A parent process opens the read end of a FIFO and then calls
fork() three times to create three children. Each child then:

1. Creates an epoll instance
2. Adds the read end of the FIFO to the epoll instance, specifying
EPOLLEXCLUSIVE

When input becomes available on the FIFO, how many processes
get a wakeup?

When I test this scenario, one process gets a wakeup.

In other words, "target file" appears to mean open file description
(aka open file table entry), not inode.

This is actually what I suspected might be the case, but now I am
puzzled. Given what I've discovered and what you suggest are the
semantics, is the implementation correct? (I suspect that it is,
but it is at odds with your statement above.) My test programs are
inline below.

Cheers,

Michael


Thanks for the test cases. So in your first test case, you are exiting
immediately after the epoll_wait() returns. So this is actually causing
the next wakeup. And then the 2nd thread returns from epoll_wait() and
this causes the 3rd wakeup.

So the wakeups are actually not happening from the write directly, but
instead from the readers doing a close(). If you do some sort of sleep
after the epoll_wait() you can confirm the behavior. So I believe this
is working as expected.
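
The experiment suggested above could be sketched roughly as follows. This is an illustration under assumptions, not one of the test programs from this thread. It uses an anonymous pipe rather than a named FIFO. The names count_woken, waiter, sleep_ms and MAX_WAITERS, and the 200/500/1000 ms timings, are made up for the sketch. Each child waits with a timeout, and a child that was woken stays alive until the others have timed out, so that its exit-time close() cannot be the cause of any further wakeups.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#ifndef EPOLLEXCLUSIVE
#define EPOLLEXCLUSIVE (1u << 28)
#endif

#define MAX_WAITERS 16   /* arbitrary cap for this sketch */

static void sleep_ms(long ms)
{
    struct timespec ts = { .tv_sec = ms / 1000,
                           .tv_nsec = (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* Child: wait up to 500 ms for input on rfd. Never returns.
   Exit status: 1 = woken, 0 = timed out, 2 = error. */
static void waiter(int rfd, uint32_t extra_flags)
{
    struct epoll_event ev = { 0 }, rev;
    int epfd = epoll_create1(0);
    int nready;

    ev.events = EPOLLIN | extra_flags;
    if (epfd == -1 || epoll_ctl(epfd, EPOLL_CTL_ADD, rfd, &ev) == -1)
        _exit(2);

    nready = epoll_wait(epfd, &rev, 1, 500);
    if (nready == 1)
        sleep_ms(1000);   /* outlive the other waiters' timeouts, so our
                             close() on exit cannot wake them */
    _exit(nready == 1 ? 1 : nready == 0 ? 0 : 2);
}

/* Fork nchildren waiters on one pipe, write a single byte, and return
   how many children reported a wakeup, or -1 on error. */
int count_woken(int nchildren, uint32_t extra_flags)
{
    int pfd[2], started = 0, woken = 0, failed = 0;
    pid_t pids[MAX_WAITERS];

    if (nchildren < 1 || nchildren > MAX_WAITERS || pipe(pfd) == -1)
        return -1;

    while (started < nchildren) {
        pid_t pid = fork();
        if (pid == -1) {
            failed = 1;
            break;
        }
        if (pid == 0)
            waiter(pfd[0], extra_flags);
        pids[started++] = pid;
    }

    sleep_ms(200);   /* give the children time to block in epoll_wait() */
    if (write(pfd[1], "x", 1) != 1)
        failed = 1;

    for (int i = 0; i < started; i++) {
        int status;
        if (waitpid(pids[i], &status, 0) == -1 || !WIFEXITED(status)
            || WEXITSTATUS(status) == 2)
            failed = 1;
        else if (WEXITSTATUS(status) == 1)
            woken++;
    }
    close(pfd[0]);
    close(pfd[1]);
    return failed ? -1 : woken;
}
```

count_woken(3, 0) should report all three children, because the byte is never read and level-triggered EPOLLIN stays ready. count_woken(3, EPOLLEXCLUSIVE) reports "one or more", per the man-page wording quoted earlier; the exact number may depend on the kernel version.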

Thanks,

-Jason


============

/* t_EPOLLEXCLUSIVE_multipen.c

   Licensed under GNU GPLv2 or later.
*/
#include <sys/epoll.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \
                        } while (0)

#define usageErr(msg, progName) \
                        do { fprintf(stderr, "Usage: "); \
                             fprintf(stderr, msg, progName); \
                             exit(EXIT_FAILURE); } while (0)

#ifndef EPOLLEXCLUSIVE
#define EPOLLEXCLUSIVE (1 << 28)
#endif

int
main(int argc, char *argv[])
{
    int fd, epfd, nready;
    struct epoll_event ev, rev;

    if (argc != 2 || strcmp(argv[1], "--help") == 0)
        usageErr("%s <FIFO>\n", argv[0]);

    epfd = epoll_create(2);
    if (epfd == -1)
        errExit("epoll_create");

    fd = open(argv[1], O_RDONLY);
    if (fd == -1)
        errExit("open");
    printf("Opened %s\n", argv[1]);

    ev.events = EPOLLIN | EPOLLEXCLUSIVE;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == -1)
        errExit("epoll_ctl");

    nready = epoll_wait(epfd, &rev, 1, -1);
    if (nready == -1)
        errExit("epoll-wait");
    printf("epoll_wait() returned %d\n", nready);

    exit(EXIT_SUCCESS);
}

===============

/* t_EPOLLEXCLUSIVE_fork.c

   Licensed under GNU GPLv2 or later.
*/

#include <sys/epoll.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \
                        } while (0)

#define usageErr(msg, progName) \
                        do { fprintf(stderr, "Usage: "); \
                             fprintf(stderr, msg, progName); \
                             exit(EXIT_FAILURE); } while (0)

#ifndef EPOLLEXCLUSIVE
#define EPOLLEXCLUSIVE (1 << 28)
#endif

int
main(int argc, char *argv[])
{
    int fd, epfd, nready;
    struct epoll_event ev, rev;
    int cnum;

    if (argc != 2 || strcmp(argv[1], "--help") == 0)
        usageErr("%s <FIFO>\n", argv[0]);

    fd = open(argv[1], O_RDONLY);
    if (fd == -1)
        errExit("open");
    printf("Opened %s\n", argv[1]);

    for (cnum = 0; cnum < 3; cnum++) {
        switch (fork()) {
        case -1:
            errExit("fork");

        case 0: /* Child */
            epfd = epoll_create(2);
            if (epfd == -1)
                errExit("epoll_create");

            ev.events = EPOLLIN | EPOLLEXCLUSIVE;
            if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == -1)
                errExit("epoll_ctl");

            nready = epoll_wait(epfd, &rev, 1, -1);
            if (nready == -1)
                errExit("epoll-wait");
            printf("Child %d: epoll_wait() returned %d\n", cnum, nready);
            exit(EXIT_SUCCESS);

        default:
            break;
        }
    }

    wait(NULL);
    wait(NULL);
    wait(NULL);

    exit(EXIT_SUCCESS);
}
