Re: [PATCH] inotify to minimize stat() calls

Robert Zeh <robert.allan.zeh@xxxxxxxxx> · Thu, 25 Apr 2013 14:37:58 -0500

On Thu, Apr 25, 2013 at 3:18 AM, Thomas Rast <trast@xxxxxxxxxxx> wrote:
>
> Robert Zeh <robert.allan.zeh@xxxxxxxxx> writes:
>
> > Here is a patch that creates a daemon that tracks file
> > state with inotify, writes it out to a file upon request,
> > and changes most of the calls to stat to use said cache.
> >
> > It has bugs, but I figured it would be smarter to see
> > if the approach was acceptable at all before spending the
> > time to root the bugs out.
>
> Thanks for tackling this; it's probably about time we got a inotify
> support :-(

> > I've implemented the communication with a file, and not a socket,
> > because I think implementing a socket is going to create
> > security issues on multiuser systems.  For example, would a
> > socket allow stat information to cross user boundaries?
>
> This ties in with an issue discussed in an earlier thread:
>
>   http://thread.gmane.org/gmane.comp.version-control.git/217817/focus=218307
>
> The conclusion there was that the default limits are set such that it is
> not feasible to run one daemon per repository (that would quickly hit
> the limits when e.g. iterating all repos in a typical android tree using
> repo).
>
> So whatever you use for communication needs to work as a global daemon.
>
> I'd just trust the SSH folks to know about security; on my system
> ssh-agent creates
>
>   /tmp/ssh-RANDOMSTRING/agent.PID
>
> where the directory has mode 0700, and the file is a unit socket with
> mode 0600.  That should make doubly sure that no other user can open the
> socket.
>
> >  filechange-cache.c   | 203
> > +++++++++++++++++++++++++++++++++++++++++++++++++++
>
> Is your MUA wrapping the patch?

Almost certainly.  I'll double check before I send off the next patch.

> > +static void watch_directory(int inotify_fd)
> > +{
> > +     char buf[PATH_MAX];
> > +
> > +     if (!getcwd(buf, sizeof(buf)))
> > +             die_errno("Unable to get current directory");
> > +
> > +     int i = 0;
> > +     struct dir_struct dir;
> > +     const char *pathspec[1] = { buf, NULL };
> > +
> > +     memset(&dir, 0, sizeof(dir));
> > +     setup_standard_excludes(&dir);
> > +
> > +     fill_directory(&dir, pathspec);
> > +     for(i = 0; i < dir.nr; i++) {
> > +             struct dir_entry *ent = dir.entries[i];
> > +             watch_file(inotify_fd, ent->name);
> > +             free(ent);
> > +     }
>
> I don't get this bit.  The lstat() are run over all files listed in the
> index.  So shouldn't your daemon watch exactly those (or rather, all
> dirnames of such files)?
I believe that fill_directory is handling watching only files in the index.
I had some problems a while back when I was only watching the
directory with some of the inotify structures coming back empty, which
is why I started watching each individual file.

> The actual directory contents are only needed to find untracked files,
> and there would be a lot of complication surrounding that, so I suggest
> saving that for later (and for now measuring the speedup with 'git
> status -uno'!).
The speed up test is a good idea.

> For example, you'd have to actually watch and re-read all .gitignore
> files, and the .git/info/exclude, and the core.excludesfile, to see if
> your notion of an ignored file became stale.
The thought in the back of my head was to simple have the daemon
restart if one of those files changed, under the assumption that a
restart wasn't that expensive, and that it would be complicated to check.

> Also, you seem to call watch_directory() only on the current(?) dir, but
> you need to recursively set up watches for all directories in the
> repository.

I'm calling fill_directory to get the list of files to watch; it appears to
be handling the recursion for me.  It also appears to be handling filtering
out all of the untracked files, etc.

> > +     while (1) {
> > +             int i = 0;
> > +             length = read(inotify_fd, buffer, sizeof(buffer));
> > +             for(i = 0; i < length; ) {
> > +                     struct inotify_event *event =
> > +                             (struct inotify_event*)(buffer+i);
> > +                     /* printf("event: %d %x %d %s\n", event->wd, event->mask,
> > +                        event->len, event->name); */
> > +                     if (request_watch_descriptor == event->wd) {
> > +                             write_stat_cache();
> > +                     } else if (root_directory_watch_descriptor
> > +                                == event->wd) {
> > +                             printf("root directory died!\n");
> > +                             exit(0);
> > +                     } else if (event->mask & IN_Q_OVERFLOW) {
> > +                             restart();
>
> Good.
>
> > +                     } else if (event->mask & IN_MODIFY) {
> > +                             if (event->len)
> > +                                     update_stat_cache(event->name);
> > +                     }
>
> So whenever a file changes, you stat() it.  That's good for simplicity
> now, but I suspect it will provide some optimization opportunities
> later.
I figured it would be a good idea to get things working, and then worry
about optimization later :-)

>
> On some design aspects, I'd want:
>
> * a toggle to run the test suite with the daemons, or without
Yeap.
> * if you go with a user-wide daemon, a way to ensure that the test-suite
>   daemon is not the same as my "real" daemon, and make sure it is killed
>   after the test runs finish
I'm assuming a command line argument that points a daemon at a port would be the
way to handle that.

> * a test that triggers IN_Q_OVERFLOW, e.g. by sending SIGSTOP and doing
>   a large repository operation
Yeap.  I think you'd want some way to verify (through a log file?)
that the overflow
happened.

> * a test that renames directories
Yeap.

> The last one is just based on my personal experience with messing with
> inotify; renaming directories is the "hard" case for that API.  We may
> already cover this in the test suite, or we may not; but it must be
> tested.
>
> Other than that last point, focus your tests not on small tests but on
> the test suite.  It would seem rather unlikely to me that you could
> manage to pass the entire test suite with this daemon active but broken.
I've had some experiences where the test suite passes with the daemon active,
but not populating the cache.
> --
> Thomas Rast
> trast@{inf,student}.ethz.ch
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html