Git operations are slow on repositories with lots of files, and lots
of tiny filesystem calls like lstat(), getdents() and open() are
responsible for this. On the linux-2.6 repository, for instance, the
numbers for "git status" look like this:

  top syscalls sorted      top syscalls sorted
  by acc. time             by number
  ----------------------------------------------
  0.401906 40950 lstat     0.401906 40950 lstat
  0.190484  5343 getdents  0.150055  5374 open
  0.150055  5374 open      0.190484  5343 getdents
  0.074843  2806 close     0.074843  2806 close
  0.003216   157 read      0.003216   157 read

To solve this problem, we propose to build a daemon which will watch
the filesystem using inotify and report batched-up events over a UNIX
socket. Since inotify is Linux-only, we have to leave open the
possibility of writing similar daemons for other platforms.
Everything will continue to work as before if there is no helper
present.

The fswatch API introduces a generic way for git.git to request
information about filesystem changes. Different helpers (like the
inotify daemon on Linux) will be plugged into this API on different
platforms. The API falls back to using the filesystem calls when no
helper is available.

The daemon will start up with the very first operation done on the
git repository, and will die after a specified period of repository
inactivity. It is going to be a per-repo daemon and will write to a
socket in the repository: access control is managed by filesystem
permissions. This design is inspired by the credential helper design.

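To make the fallback behaviour concrete, here is a minimal sketch (not
part of this patch) of how a caller might try the per-repository socket
first and use lstat() only when no helper is listening. Everything in it
is an assumption for illustration: the names fswatch_connect() and
fswatch_or_lstat(), the socket path, and the return convention are all
hypothetical, and the wire protocol is left out because the proposal
does not define one yet.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical: connect to the per-repository helper socket.
 * Returns a connected fd, or -1 if no helper is listening there.
 */
static int fswatch_connect(const char *socket_path)
{
	struct sockaddr_un sa;
	int fd;

	if (strlen(socket_path) >= sizeof(sa.sun_path))
		return -1; /* path does not fit in sun_path */

	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	memset(&sa, 0, sizeof(sa));
	sa.sun_family = AF_UNIX;
	strcpy(sa.sun_path, socket_path);

	if (connect(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Hypothetical entry point. Returns
 *    1 if a helper is reachable; *helper_fd is the connected socket
 *      and the caller would read batched events from it,
 *    0 if no helper is reachable and lstat() filled in *st instead,
 *   -1 if no helper is reachable and lstat() failed too.
 */
static int fswatch_or_lstat(const char *socket_path, const char *path,
			    struct stat *st, int *helper_fd)
{
	*helper_fd = fswatch_connect(socket_path);
	if (*helper_fd >= 0)
		return 1;
	return lstat(path, st) < 0 ? -1 : 0;
}
```

With no daemon listening, the call degrades to a single lstat(), which
is the "everything will continue to work as before" case. Access control
needs no extra code in this sketch, since connect() on a UNIX socket is
already subject to the filesystem permissions of the socket path.
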
Signed-off-by: Ramkumar Ramachandra <artagnon@xxxxxxxxx>
---
 Documentation/technical/api-fswatch.txt | 62 +++++++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)
 create mode 100644 Documentation/technical/api-fswatch.txt

diff --git a/Documentation/technical/api-fswatch.txt b/Documentation/technical/api-fswatch.txt
new file mode 100644
index 0000000..9c6826a
--- /dev/null
+++ b/Documentation/technical/api-fswatch.txt
@@ -0,0 +1,62 @@
+fswatch API
+===========
+
+The fswatch API provides an abstracted way of collecting information
+about filesystem changes. An fswatch helper is typically a daemon which
+uses inotify to watch the filesystem, and this information is used by
+git instead of making expensive system calls like lstat() and open().
+
+Typical setup
+-------------
+
+------------
++-----------------------+
+| Git code (C)          |--- requires information about fs changes
+|.......................|
+| C fswatch API         |--- system calls ---> filesystem
++-----------------------+
+	^             |
+	| UNIX socket |
+	|             v
++-----------------------+
+| Git fswatch helper    |--- daemon inotify-watching ---> filesystem
++-----------------------+
+------------
+
+The Git code will call the C API to obtain changes in filesystem
+information. The API will itself call a configured helper (e.g. "git
+fswatch-notify") which may report filesystem changes, if the helper
+daemon was started in a previous invocation. If the daemon is not
+already running, it is started, and the C API will fall back to
+making expensive system calls.
+
+C API
+-----
+
+The fswatch C API is meant to be called by Git code which needs
+information about filesystem changes. It is centered around an
+object representing the changes to the filesystem since the last
+invocation.
+
+Data Structures
+~~~~~~~~~~~~~~~
+
+`struct fschanges`::
+
+	TODO
+
+
+Functions
+~~~~~~~~~
+
+TODO
+
+Example
+~~~~~~~
+
+TODO
+
+fswatch Helpers
+---------------
+
+TODO
-- 
1.8.1.5