On Thu, Jul 01 2021, Jeff Hostetler via GitGitGadget wrote: > From: Jeff Hostetler <jeffhost@xxxxxxxxxxxxx> > > Teach fsmonitor--daemon client threads to create a cookie file > inside the .git directory and then wait until FS events for the > cookie are observed by the FS listener thread. > > This helps address the racy nature of file system events by > blocking the client response until the kernel has drained any > event backlog. > > This is especially important on MacOS where kernel events are > only issued with a limited frequency. See the `latency` argument > of `FSeventStreamCreate()`. The kernel only signals every `latency` > seconds, but does not guarantee that the kernel queue is completely > drained, so we may have to wait more than one interval. If we > increase the frequency, the system is more likely to drop events. > We avoid these issues by having each client thread create a unique > cookie file and then wait until it is seen in the event stream. Is this a guaranteed property of any API fsmonitor might need to work with (linux, darwin, Windows) that if I perform a bunch of FS operations on my working tree, that if I finish up by touching this cookie file that that'll happen last? I'd think that wouldn't be the case, i.e. on POSIX filesystems unless you run around fsyncing both files and directories you're not guaranteed that they're on disk, and even then the kernel might decide to sync your cookie earlier, won't it? E.g. on Linux you can even have cross-FS watches, and mix & match different FS types. I'd expect to get events in whatever implementation-defined order the VFS layer + FS decided to sync them to disk in & get to firing off an event for me. Or do these APIs all guarantee that a linear view of the world is presented to the API consumer? > Co-authored-by: Kevin Willford <Kevin.Willford@xxxxxxxxxxxxx> > Co-authored-by: Johannes Schindelin <johannes.schindelin@xxxxxx> > Signed-off-by: Jeff Hostetler <jeffhost@xxxxxxxxxxxxx> > --- > builtin/fsmonitor--daemon.c | 228 +++++++++++++++++++++++++++++++++++- > fsmonitor--daemon.h | 5 + > 2 files changed, 232 insertions(+), 1 deletion(-) > > diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c > index 8249420ba18..25f18f2726b 100644 > --- a/builtin/fsmonitor--daemon.c > +++ b/builtin/fsmonitor--daemon.c > @@ -94,6 +94,149 @@ static int do_as_client__status(void) > } > } > > +enum fsmonitor_cookie_item_result { > + FCIR_ERROR = -1, /* could not create cookie file ? */ > + FCIR_INIT = 0, > + FCIR_SEEN, > + FCIR_ABORT, > +}; > + > +struct fsmonitor_cookie_item { > + struct hashmap_entry entry; > + const char *name; > + enum fsmonitor_cookie_item_result result; > +}; > + > +static int cookies_cmp(const void *data, const struct hashmap_entry *he1, > + const struct hashmap_entry *he2, const void *keydata) > +{ > + const struct fsmonitor_cookie_item *a = > + container_of(he1, const struct fsmonitor_cookie_item, entry); > + const struct fsmonitor_cookie_item *b = > + container_of(he2, const struct fsmonitor_cookie_item, entry); Re earlier comments about verbose names, these are just enums in a builtin/*.c file, so a name like "cookie_item" is fine, and then the whole thing might even fit on one line.. :) > [...] > + /* > + * We will write filesystem syncing cookie files into > + * <gitdir>/<fsmonitor-dir>/<cookie-dir>/<pid>-<seq>. > + * > + * The extra layers of subdirectories here keep us from > + * changing the mtime on ".git/" or ".git/foo/" when we create > + * or delete cookie files. > + * > + * There have been problems with some IDEs that do a > + * non-recursive watch of the ".git/" directory and run a > + * series of commands any time something happens. > + * > + * For example, if we place our cookie files directly in > + * ".git/" or ".git/foo/" then a `git status` (or similar > + * command) from the IDE will cause a cookie file to be > + * created in one of those dirs. This causes the mtime of > + * those dirs to change. This triggers the IDE's watch > + * notification. This triggers the IDE to run those commands > + * again. And the process repeats and the machine never goes > + * idle. > + * > + * Adding the extra layers of subdirectories prevents the > + * mtime of ".git/" and ".git/foo" from changing when a > + * cookie file is created. > + */ > + strbuf_init(&state.path_cookie_prefix, 0); > + strbuf_addbuf(&state.path_cookie_prefix, &state.path_gitdir_watch); > + > + strbuf_addch(&state.path_cookie_prefix, '/'); > + strbuf_addstr(&state.path_cookie_prefix, FSMONITOR_DIR); > + mkdir(state.path_cookie_prefix.buf, 0777); > + > + strbuf_addch(&state.path_cookie_prefix, '/'); > + strbuf_addstr(&state.path_cookie_prefix, FSMONITOR_COOKIE_DIR); > + mkdir(state.path_cookie_prefix.buf, 0777); > + > + strbuf_addch(&state.path_cookie_prefix, '/'); So, on some stupid IDEs (would be nice to have specifics in the commit message, which ones/versions?) this avoids causing infinite activity, but on slightly more industrious stupid IDEs that would do their own recursive watch we'll have the same problem? Perhaps we should just consider creating it at the top-level and those IDEs will eventually sort out their bugs, sooner than later if this feature ships... > + strbuf_release(&state.path_cookie_prefix); > + > + /* > + * NEEDSWORK: Consider "rm -rf <gitdir>/<fsmonitor-dir>" > + */ That would also make this trivial, presumably it's a "needswork" since you have this recursive structure, but if it's at the top-level we already did the unlink() above, so NEEDSWORK solved then?