Re: [PATCH v2 5/6] fsmonitor: add documentation for the fsmonitor extension.

Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> · Mon, 22 May 2017 19:28:12 +0200

On Mon, May 22, 2017 at 6:18 PM, Ben Peart <peartben@xxxxxxxxx> wrote:
> On 5/20/2017 8:10 AM, Ævar Arnfjörð Bjarmason wrote:
>>>
>>> +== File System Monitor cache
>>> +
>>> +  The file system monitor cache tracks files for which the
>>> query-fsmonitor
>>> +  hook has told us about changes.  The signature for this extension is
>>> +  { 'F', 'S', 'M', 'N' }.
>>> +
>>> +  The extension starts with
>>> +
>>> +  - 32-bit version number: the current supported version is 1.
>>> +
>>> +  - 64-bit time: the extension data reflects all changes through the
>>> given
>>> +       time which is stored as the seconds elapsed since midnight,
>>> January 1, 1970.
>>> +
>>> +  - 32-bit bitmap size: the size of the CE_FSMONITOR_DIRTY bitmap.
>>> +
>>> +  - An ewah bitmap, the n-th bit indicates whether the n-th index entry
>>> +    is CE_FSMONITOR_DIRTY.
>>
>>
>> We already have a uint64_t in one place in the codebase (getnanotime)
>> which uses a 64 bit time for nanosecond accuracy, and numerous
>> filesystems already support nanosecond timestamps (ext4, that new
>> Apple thingy...).
>>
>> I don't know if any of the inotify/fsmonitor APIs support that yet,
>> but it seems inevitable that that'll be added if not, in some
>> pathological cases we can have a lot of files modified in 1 second, so
>> using nanosecond accuracy means there'll be a lot less data to
>> consider in some cases.
>>
>> It does mean this'll only work until the year ~2500, but that seems
>> like an acceptable trade-off.
>>
>
> I really don't think nano-second resolution is needed in this case for a few
> reasons.
>
> The number of files that can change within a given second is limited by the
> IO throughput of the underlying device. Even assuming a very fast device and
> very small files and changes, this won't be that many files.
>
> Without this patch, git would have scanned all those files every time. With
> this patch, git will only scan those files a 2nd time that are modified in
> the same second that it did the first scan *that came before the first scan
> started* (the "lots of files modified" section in the 1 second timeline
> below).
>
> |------------------------- one second ---------------------|
> |-lots of files modified - git status - more file modified-|
>
> Yes, some duplicate status checks can be made but its still a significant
> win in any reasonable scenario. Especially when you consider that it is
> pretty unusual to do git status/add/commit calls in the middle of making
> lots of changes to files.

I understand that we get most of the wins either way.

I'm just wondering why not make it nanosecond-resolution now from the
beginning since that's where major filesystems are heading already,
and changing stuff like this can be a hassle once it's initially out
there, whereas just dividing by 10^9 for APIs that need seconds from
the beginning is easy & future-proof.

There are some uses of git where this would probably matter in practice.

E.g. consider a git-annex backed fileserver using ext4, currently
git-annex comes with its own FS watcher as a daemon invoked via `git
annex watch` to constantly add new files without killing performance
via a status/add in a loop, with this a daemon like that could just
run status/add in a loop, but on a busy repo the 1s interval size
might start to matter as you're constantly inspecting larger
intervals.

More importantly though, I can't think of any case where having it in
nanoseconds to begin with would do any harm.

> In addition, the backing file system monitor (Watchman) supports number of
> seconds since the unix epoch (unix time_t style).  This means any support of
> nano seconds by git is academic until someone provides a file system watcher
> that does support nano second granularity.

I haven't used watchman for anything non-trivial, but the
documentation for the query API you seem to be using says you can
request a {ctime,mtime}_ns field:

https://github.com/facebook/watchman/blob/master/website/_docs/cmd.query.markdown#user-content-available-fields

And isn't this the documentation for the "since" query you're using,
saying you can specify any arbitrary fs timing field, such as a _ns
time field:

https://github.com/facebook/watchman/blob/master/website/_docs/expr.since.md

?

> Finally, the fsmonitor index extension is versioned so that we can
> seamlessly upgrade to nano second resolution later if we desire.

Aside from my nanosecond bikeshedding, let's say we change the
semantics of any of this in the future: The index has the version, but
there's one argument passed to the hook without a version. Is the hook
expected to potentially be reading the version from the index to make
sense of whether (in this case) the argument is a mtime or mtime_ns?

Wouldn't this make more sense than that on top, i.e. pass the version
to the hook, untested (and probably whines about the sprintf
format...):

$ git diff -U1

diff --git a/cache.h b/cache.h
index 6eafd34fff..3c63f179f8 100644
--- a/cache.h
+++ b/cache.h
@@ -346,2 +346,3 @@ struct index_state {
        struct untracked_cache *untracked;
+       uint32_t fsmonitor_version;
        time_t fsmonitor_last_update;
diff --git a/fsmonitor.c b/fsmonitor.c
index f5a9f818e8..7236b538ac 100644
--- a/fsmonitor.c
+++ b/fsmonitor.c
@@ -60,2 +60,4 @@ int read_fsmonitor_extension(struct index_state
*istate, const void *data,
                return error("bad fsmonitor version %d", hdr_version);
+       else
+               istate->fsmonitor_version = hdr_version;

@@ -137,2 +139,3 @@ static int query_fsmonitor(time_t last_update,
struct strbuf *query_result)
        struct child_process cp = CHILD_PROCESS_INIT;
+       char version[1];
        char date[64];
@@ -143,2 +146,3 @@ static int query_fsmonitor(time_t last_update,
struct strbuf *query_result)

+       snprintf(version, sizeof(version), "%d", istate->fsmonitor_version);
        snprintf(date, sizeof(date), "%" PRIuMAX, (uintmax_t)last_update)