Re: [rfc 5/7] fs, epoll: Add procfs fdinfo helper

On Thu, Jul 19, 2012 at 07:52:41AM -0700, Matthew Helsley wrote:
> On Wed, Jun 27, 2012 at 4:01 AM, Cyrill Gorcunov <gorcunov@xxxxxxxxxx> wrote:
> > This allows us to print out the eventpoll target file descriptor,
> > events and data; /proc/pid/fdinfo/fd then consists of
> >
> >  | pos: 0
> >  | flags:       02
> >  | tfd:        5 events:       1d data: ffffffffffffffff
> >
> > +#if defined(CONFIG_PROC_FS) && defined(CONFIG_CHECKPOINT_RESTORE)
> > +
> > +struct epitem_fdinfo {
> > +       struct epoll_event      ev;
> > +       int                     fd;
> > +};
> > +
> > +static struct epitem_fdinfo *
> > +seq_lookup_fdinfo(struct proc_fdinfo_extra *extra, struct eventpoll *ep, loff_t num)
> > +{
> > +       struct epitem_fdinfo *fdinfo = extra->priv;
> > +       struct epitem *epi = NULL;
> > +       struct rb_node *rbp;
> > +
> > +       mutex_lock(&ep->mtx);
> > +       for (rbp = rb_first(&ep->rbr); rbp; rbp = rb_next(rbp)) {
> > +               if (num-- == 0) {
> > +                       epi = rb_entry(rbp, struct epitem, rbn);
> > +                       fdinfo->fd = epi->ffd.fd;
> > +                       fdinfo->ev = epi->event;
> > +                       break;
> 
> This will be incredibly slow. epoll was designed to scale to tens of
> thousands of file descriptors. This algorithm is O(N^2) because each
> time we show a new epoll item we walk through the whole rb tree again
> (we're not doing a search so it isn't O(NlogN)).

Yeah, I know, it's quadratic. I'll rework this series to call
seq_printf directly and print out the whole tree in one pass when the
corresponding fdinfo file gets read.
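
For illustration only, here is a minimal userspace sketch of the cost
difference. It assumes a plain in-order chain in place of the rb tree, and
the names (struct item, build_chain, visits_lookup_by_index,
visits_single_pass) are hypothetical, not kernel code. It counts node
visits for the restart-per-record lookup from the quoted hunk versus one
walk over the whole set.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an epoll item; not the kernel's struct epitem. */
struct item {
	int fd;
	struct item *next;	/* in-order successor, standing in for rb_next() */
};

/* Link items[0..n-1] into one in-order chain and return its head. */
static struct item *build_chain(struct item *items, size_t n)
{
	assert(n > 0);
	for (size_t i = 0; i < n; i++) {
		items[i].fd = (int)i;
		items[i].next = (i + 1 < n) ? &items[i + 1] : NULL;
	}
	return &items[0];
}

/*
 * Restart from the head for every record, as the quoted lookup does.
 * Returns the number of nodes visited to show all n items.
 */
static unsigned long visits_lookup_by_index(const struct item *head, size_t n)
{
	unsigned long visits = 0;

	for (size_t idx = 0; idx < n; idx++) {
		size_t num = idx;

		for (const struct item *it = head; it; it = it->next) {
			visits++;
			if (num-- == 0)
				break;
		}
	}
	return visits;
}

/* Walk the chain once, visiting each node exactly one time. */
static unsigned long visits_single_pass(const struct item *head)
{
	unsigned long visits = 0;

	for (const struct item *it = head; it; it = it->next)
		visits++;
	return visits;
}
```

With n items the restart-per-record variant visits n(n+1)/2 nodes, while
the single pass visits n, which is the point of printing the whole tree
in one go.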

> Also, we could miss one or more later items if one of the earlier
> items is removed from the epoll set in between "seq_lookup_fdinfo"
> calls. This isn't a problem for checkpoint because we assume the task
> (and everything with this eventpoll file in its fd table) is frozen.
> However it means the file will be worse than useless for almost any
> other purpose because they are unlikely to realize they need to freeze
> all the task(s) to get consistent data.

Well, a lot of the data read from proc is consistent only at the moment
of reading.
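
To make the skipped-item case concrete, here is a small userspace sketch,
assuming a sorted array in place of the rb tree. All names (struct fdset,
lookup_nth, remove_fd, read_with_removal) are hypothetical. The reader
fetches records by index, and one earlier entry is removed between two
lookups.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of an epoll set: fds kept in order in a small array. */
struct fdset {
	int fds[8];
	size_t len;
};

/* Fetch the num-th entry, like the index-based lookup. Returns 1 if found. */
static int lookup_nth(const struct fdset *s, size_t num, int *fd)
{
	if (num >= s->len)
		return 0;
	*fd = s->fds[num];
	return 1;
}

/* Remove fd from the set, shifting later entries down by one slot. */
static void remove_fd(struct fdset *s, int fd)
{
	for (size_t i = 0; i < s->len; i++) {
		if (s->fds[i] != fd)
			continue;
		memmove(&s->fds[i], &s->fds[i + 1],
			(s->len - i - 1) * sizeof(s->fds[0]));
		s->len--;
		return;
	}
}

/*
 * Read the set record by record; once remove_after records have been read,
 * drop victim from the set. Stores the fds seen in out and returns the count.
 */
static size_t read_with_removal(struct fdset *s, int victim,
				size_t remove_after, int *out)
{
	size_t seen = 0;
	int fd;

	assert(remove_after > 0);
	for (size_t num = 0; lookup_nth(s, num, &fd); num++) {
		out[seen++] = fd;
		if (seen == remove_after)
			remove_fd(s, victim);
	}
	return seen;
}
```

Starting from fds 10, 11, 12, 13 and removing 10 after two records have
been read, the reader sees 10, 11, 13: fd 12 slides into an index that was
already consumed and is never reported.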

	Cyrill