On Wed, Jun 27, 2012 at 4:01 AM, Cyrill Gorcunov <gorcunov@xxxxxxxxxx> wrote:
> This allow us to print out eventpoll target file descriptor,
> events and data, the /proc/pid/fdinfo/fd consists of
>
> | pos: 0
> | flags: 02
> | tfd: 5 events: 1d data: ffffffffffffffff
>
> This feature is CONFIG_CHECKPOINT_RESTORE only.
>
> Signed-off-by: Cyrill Gorcunov <gorcunov@xxxxxxxxxx>
> CC: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
> CC: Alexey Dobriyan <adobriyan@xxxxxxxxx>
> CC: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> CC: Pavel Emelyanov <xemul@xxxxxxxxxxxxx>
> CC: James Bottomley <jbottomley@xxxxxxxxxxxxx>
> ---
>  fs/eventpoll.c | 81 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 81 insertions(+)
>
> Index: linux-2.6.git/fs/eventpoll.c
> ===================================================================
> --- linux-2.6.git.orig/fs/eventpoll.c
> +++ linux-2.6.git/fs/eventpoll.c
> @@ -38,6 +38,8 @@
>  #include <asm/io.h>
>  #include <asm/mman.h>
>  #include <linux/atomic.h>
> +#include <linux/proc_fs.h>
> +#include <linux/seq_file.h>
>
>  /*
>   * LOCKING:
> @@ -1897,6 +1899,83 @@ SYSCALL_DEFINE6(epoll_pwait, int, epfd,
>  	return error;
>  }
>
> +#if defined(CONFIG_PROC_FS) && defined(CONFIG_CHECKPOINT_RESTORE)
> +
> +struct epitem_fdinfo {
> +	struct epoll_event ev;
> +	int fd;
> +};
> +
> +static struct epitem_fdinfo *
> +seq_lookup_fdinfo(struct proc_fdinfo_extra *extra, struct eventpoll *ep, loff_t num)
> +{
> +	struct epitem_fdinfo *fdinfo = extra->priv;
> +	struct epitem *epi = NULL;
> +	struct rb_node *rbp;
> +
> +	mutex_lock(&ep->mtx);
> +	for (rbp = rb_first(&ep->rbr); rbp; rbp = rb_next(rbp)) {
> +		if (num-- == 0) {
> +			epi = rb_entry(rbp, struct epitem, rbn);
> +			fdinfo->fd = epi->ffd.fd;
> +			fdinfo->ev = epi->event;
> +			break;

This will be incredibly slow. epoll was designed to scale to tens of thousands of file descriptors.
This algorithm is O(N^2): each time we show a new epoll item we walk
the rb tree again from the first node (we're not doing a keyed search,
so it isn't O(N log N)).

Also, we could miss one or more later items if one of the earlier items
is removed from the epoll set between seq_lookup_fdinfo() calls. This
isn't a problem for checkpoint because we assume the task (and
everything with this eventpoll file in its fd table) is frozen. However,
it means the file will be worse than useless for almost any other
purpose, because other users are unlikely to realize they need to
freeze all the task(s) to get consistent data.

Cheers,
	-Matt Helsley
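
To make both points concrete, here is a minimal userspace sketch; it is
not kernel code and not part of the patch. It assumes a singly linked
list in place of the rb tree, since only the in-order walk matters for
the argument. The names struct item, lookup_nth(), build_list(),
dump_all_steps() and dump_with_removal() are hypothetical and exist only
for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an epitem; a list replaces the rb tree. */
struct item {
	int fd;
	struct item *next;
};

/*
 * Positional lookup in the style of the quoted seq_lookup_fdinfo():
 * restart at the first item and step forward until position num.
 * *steps counts every node visited.
 */
static struct item *lookup_nth(struct item *head, long num,
			       unsigned long *steps)
{
	struct item *it;

	for (it = head; it; it = it->next) {
		(*steps)++;
		if (num-- == 0)
			return it;
	}
	return NULL;
}

/* Link n items with fds 0..n-1 out of caller-provided storage. */
static struct item *build_list(struct item *pool, int n)
{
	int i;

	assert(n >= 0);
	for (i = 0; i < n; i++) {
		pool[i].fd = i;
		pool[i].next = (i + 1 < n) ? &pool[i + 1] : NULL;
	}
	return n > 0 ? &pool[0] : NULL;
}

/* Dump every item by position; return the total number of node visits. */
static unsigned long dump_all_steps(struct item *head, int *shown)
{
	unsigned long steps = 0;
	long pos = 0;

	*shown = 0;
	while (lookup_nth(head, pos, &steps)) {
		(*shown)++;
		pos++;
	}
	return steps;
}

/*
 * Dump by position, but unlink the first item right after it has been
 * shown, roughly as a concurrent EPOLL_CTL_DEL on an unfrozen task
 * could. Stores the fds that were actually seen; returns their count.
 */
static int dump_with_removal(struct item *head, int *seen, int max)
{
	unsigned long steps = 0;
	long pos = 0;
	int count = 0;
	struct item *it;

	while (count < max && (it = lookup_nth(head, pos, &steps)) != NULL) {
		seen[count++] = it->fd;
		if (pos == 0)
			head = head->next;	/* an earlier item goes away */
		pos++;
	}
	return count;
}
```

With four items, dump_all_steps() visits 14 nodes (1+2+3+4 for the hits,
plus 4 for the final miss), so the cost grows quadratically with the
number of items. dump_with_removal() on the same four items reports fds
0, 2 and 3: fd 1 is never shown, because every later position shifts
down by one once the first item is gone.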