Add an experimental built-in FSMonitor #3082

Merged
merged 44 commits into from Mar 8, 2021
Commits
21f4ba1
Merge remote-tracking branch 'ggg/jh/fsmonitor-prework' into fsmonitor4
jeffhostetler Feb 22, 2021
e194809
fsmonitor: fix memory corruption in some corner cases
dscho Mar 2, 2021
2b4dd0c
fsmonitor: do not forget to release the token in `discard_index()`
dscho Mar 2, 2021
ebf106b
pkt-line: eliminate the need for static buffer in packet_write_gently()
jeffhostetler Jan 26, 2021
a6c0c42
pkt-line: do not issue flush packets in write_packetized_*()
dscho Feb 11, 2020
e24c74b
pkt-line: add PACKET_READ_GENTLE_ON_READ_ERROR option
dscho Feb 10, 2020
c6e4e01
pkt-line: add options argument to read_packetized_to_strbuf()
dscho Feb 11, 2020
0213ed9
simple-ipc: design documentation for new IPC mechanism
jeffhostetler Jul 8, 2020
ae8103d
simple-ipc: add win32 implementation
jeffhostetler Jul 9, 2020
e287fcb
unix-socket: eliminate static unix_stream_socket() helper function
jeffhostetler Jan 27, 2021
5b0f4e9
unix-socket: add backlog size option to unix_stream_listen()
jeffhostetler Jul 28, 2020
d74719a
unix-socket: disallow chdir() when creating unix domain sockets
jeffhostetler Jul 29, 2020
c4bd6a0
unix-stream-server: create unix domain socket under lock
jeffhostetler Feb 9, 2021
ff57cc0
simple-ipc: add Unix domain socket implementation
jeffhostetler Feb 12, 2021
6202d0e
t0052: add simple-ipc tests and t/helper/test-simple-ipc tool
jeffhostetler Jul 8, 2020
db970f9
Merge remote-tracking branch 'jeffhostetler/simple-ipc2' into fsmonitor4
jeffhostetler Feb 22, 2021
5162e1f
Merge branch 'assorted-git-artifacts-fixes'
dscho Mar 5, 2021
f783c65
fsmonitor--daemon: man page and documentation
jeffhostetler Dec 14, 2020
7c17e4f
fsmonitor-ipc: create client routines for git-fsmonitor--daemon
jeffhostetler Dec 14, 2020
88bfdc9
config: FSMonitor is repository-specific
dscho Mar 5, 2021
4654a3e
fsmonitor: introduce `core.useBuiltinFSMonitor` to call the daemon vi…
dscho Aug 2, 2019
5c6a9f8
fsmonitor--daemon: add a built-in fsmonitor daemon
jeffhostetler Dec 16, 2020
1c9f8ec
fsmonitor--daemon: implement client command options
jeffhostetler Dec 16, 2020
c1cf306
fsmonitor-fs-listen-win32: stub in backend for Windows
jeffhostetler Dec 17, 2020
3a7659b
fsmonitor-fs-listen-macos: stub in backend for MacOS
jeffhostetler Dec 17, 2020
fce149c
fsmonitor--daemon: implement daemon command options
jeffhostetler Dec 17, 2020
d49f164
fsmonitor--daemon: add pathname classification
jeffhostetler Dec 17, 2020
6b4070e
fsmonitor--daemon: define token-ids
jeffhostetler Dec 17, 2020
3ffa3f6
fsmonitor--daemon: create token-based changed path cache
jeffhostetler Dec 17, 2020
7952e18
fsmonitor-fs-listen-win32: implement FSMonitor backend on Windows
jeffhostetler Dec 17, 2020
581ab48
fsmonitor-fs-listen-macos: add macos header files for FSEvent
jeffhostetler Dec 18, 2020
4d728d1
fsmonitor-fs-listen-macos: implement FSEvent listener on MacOS
jeffhostetler Dec 18, 2020
bd9d11c
fsmonitor--daemon: implement handle_client callback
jeffhostetler Dec 18, 2020
6cc4f89
fsmonitor--daemon: periodically truncate list of modified files
jeffhostetler Dec 18, 2020
bbf2a4f
fsmonitor--daemon:: introduce client delay for testing
jeffhostetler Dec 18, 2020
df6bf08
fsmonitor--daemon: use a cookie file to sync with file system
jeffhostetler Dec 18, 2020
5fa93cf
fsmonitor: force update index when fsmonitor token advances
jeffhostetler Jan 7, 2021
f9046e5
t7527: create test for fsmonitor--daemon
jeffhostetler Dec 18, 2020
fe048bf
p7519: add fsmonitor--daemon
jeffhostetler Jan 15, 2021
e666cc2
t7527: test status with untracked-cache and fsmonitor--daemon
jeffhostetler Mar 1, 2021
f049558
Merge branch 'fsmonitor4' of https://github.com/jeffhostetler/git
dscho Feb 24, 2021
e758ff7
Merge branch 'fix-fsmonitor-crash'
dscho Mar 2, 2021
44234a2
fsmonitor: mark the built-in FSMonitor as experimental
dscho Mar 5, 2021
60deac2
Enable the built-in FSMonitor as an experimental feature
dscho Mar 5, 2021
1 change: 1 addition & 0 deletions .gitignore
@@ -71,6 +71,7 @@
/git-format-patch
/git-fsck
/git-fsck-objects
/git-fsmonitor--daemon
/git-gc
/git-get-tar-commit-id
/git-grep
47 changes: 37 additions & 10 deletions Documentation/config/core.txt
@@ -66,18 +66,45 @@ core.fsmonitor::
will identify all files that may have changed since the

Should we qualify this a little?

If set, the value of this variable is used as a command hook which will...

requested date/time. This information is used to speed up git by
avoiding unnecessary processing of files that have not changed.
See the "fsmonitor-watchman" section of linkgit:githooks[5].
+
See the "fsmonitor-watchman" section of linkgit:githooks[5].

?? ... and see core.fsmonitorHookVersion ??

Reply from the PR author:

Hmm. I don't think we need this here, but you're right, I need to mention the core.useBuiltinFSMonitor option in githooks.txt.

+
Note: FSMonitor hooks (and this config setting) are ignored if the
(experimental) built-in FSMonitor is enabled (see
`core.useBuiltinFSMonitor`).

core.fsmonitorHookVersion::
Sets the version of hook that is to be used when calling fsmonitor.
There are currently versions 1 and 2. When this is not set,
version 2 will be tried first and if it fails then version 1
will be tried. Version 1 uses a timestamp as input to determine
which files have changes since that time but some monitors
like watchman have race conditions when used with a timestamp.
Version 2 uses an opaque string so that the monitor can return
something that can be used to determine what files have changed
without race conditions.
Sets the version of hook that is to be used when calling the
FSMonitor hook (as configured via `core.fsmonitor`).
+
There are currently versions 1 and 2. When this is not set,
version 2 will be tried first and if it fails then version 1
will be tried. Version 1 uses a timestamp as input to determine
which files have changes since that time but some monitors
like watchman have race conditions when used with a timestamp.
Version 2 uses an opaque string so that the monitor can return
something that can be used to determine what files have changed
without race conditions.
+
Note: FSMonitor hooks (and this config setting) are ignored if the
built-in FSMonitor is enabled (see `core.useBuiltinFSMonitor`).

core.useBuiltinFSMonitor::
(EXPERIMENTAL) If set to true, enable the built-in filesystem
event watcher (for technical details, see
linkgit:git-fsmonitor--daemon[1]).
+
Like external (hook-based) FSMonitors, the built-in FSMonitor can speed up
Git commands that need to refresh the Git index (e.g. `git status`) in a
worktree with many files. The built-in FSMonitor facility eliminates the
need to install and maintain an external third-party monitoring tool.
+
The built-in FSMonitor is currently available only on a limited set of
supported platforms.
+
Note: if this config setting is set to `true`, any FSMonitor hook
configured via `core.fsmonitor` (and possibly `core.fsmonitorHookVersion`)
is ignored.

core.trustctime::
If false, the ctime differences between the index and the
107 changes: 107 additions & 0 deletions Documentation/git-fsmonitor--daemon.txt
@@ -0,0 +1,107 @@
git-fsmonitor--daemon(1)
========================

NAME
----
git-fsmonitor--daemon - (EXPERIMENTAL) Builtin file system monitor daemon

SYNOPSIS
--------
[verse]
'git fsmonitor--daemon' --start
'git fsmonitor--daemon' --run
'git fsmonitor--daemon' --stop
'git fsmonitor--daemon' --is-running
'git fsmonitor--daemon' --is-supported
'git fsmonitor--daemon' --query <token>
'git fsmonitor--daemon' --query-index
'git fsmonitor--daemon' --flush

DESCRIPTION
-----------

NOTE! This command is still only an experiment, subject to change dramatically
(or even to be abandoned).

Monitors files and directories in the working directory for changes using
platform-specific file system notification facilities.

It communicates directly with commands like `git status` using the
link:technical/api-simple-ipc.html[simple IPC] interface instead of
the slower linkgit:githooks[5] interface.

OPTIONS
-------

--start::
Starts the fsmonitor daemon in the background.

--run::
Runs the fsmonitor daemon in the foreground.

--stop::
Stops the fsmonitor daemon running for the current working
directory, if present.

--is-running::
Exits with zero status if the fsmonitor daemon is watching the
current working directory.

--is-supported::
Exits with zero status if the fsmonitor daemon feature is supported
on this platform.

--query <token>::
Connects to the fsmonitor daemon (starting it if necessary) and
requests the list of changed files and directories since the
given token.
This is intended for testing purposes.

--query-index::
Read the current `<token>` from the File System Monitor index
extension (if present) and use it to query the fsmonitor daemon.
This is intended for testing purposes.

--flush::
Force the fsmonitor daemon to flush its in-memory cache and
re-sync with the file system.
This is intended for testing purposes.

REMARKS
-------
The fsmonitor daemon is a long running process that will watch a single
working directory. Commands, such as `git status`, should automatically
start it (if necessary) when `core.useBuiltinFSMonitor` is set to `true`
(see linkgit:git-config[1]).

Configure the built-in FSMonitor via `core.useBuiltinFSMonitor` in each
working directory separately, or globally via `git config --global
core.useBuiltinFSMonitor true`.

Tokens are opaque strings. They are used by the fsmonitor daemon to
mark a point in time and the associated internal state. Callers should
make no assumptions about the content of the token. In particular,
they should not assume that it is a timestamp.

Query commands send a request-token to the daemon and it responds with
a summary of the changes that have occurred since that token was
created. The daemon also returns a response-token that the client can
use in a future query.
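
For illustration only (this is not part of the patch), the request-token and response-token exchange described above can be sketched in C as follows. The `toy_*` names, the `toy:<n>` token layout, and the rule that an unrecognized token yields -1 are all assumptions made for this sketch. The daemon's real token format and cache differ, and callers must treat real tokens as opaque.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX_CHANGES 64

/* Hypothetical in-memory change log; not the daemon's data structure. */
struct toy_monitor {
	const char *paths[TOY_MAX_CHANGES];
	size_t nr;
};

/* Record one changed path (silently dropped once the toy log is full). */
static void toy_record(struct toy_monitor *m, const char *path)
{
	assert(m && path);
	if (m->nr < TOY_MAX_CHANGES)
		m->paths[m->nr++] = path;
}

/* Format a token that marks the current position in the log. */
static void toy_token(const struct toy_monitor *m, char *buf, size_t len)
{
	snprintf(buf, len, "toy:%lu", (unsigned long)m->nr);
}

/*
 * Copy the paths recorded since `request_token` into `changed` and
 * write a fresh token into `response_token`.  Returns the number of
 * paths, or -1 if the request token is not one this monitor issued
 * (sketch assumption: the caller then rescans everything).
 */
static int toy_query(const struct toy_monitor *m, const char *request_token,
		     const char **changed, char *response_token, size_t len)
{
	const char *digits;
	char *end;
	unsigned long since;
	size_t i;

	toy_token(m, response_token, len);
	if (strncmp(request_token, "toy:", 4))
		return -1;
	digits = request_token + 4;
	since = strtoul(digits, &end, 10);
	if (end == digits || *end || since > m->nr)
		return -1;
	for (i = since; i < m->nr; i++)
		changed[i - since] = m->paths[i];
	return (int)(m->nr - since);
}
```

A client of this sketch stores only the response token and passes it back unchanged on the next query. A bare timestamp such as `1614902400` is rejected here, which mirrors the advice not to assume that a token is a timestamp.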

For more information see the "File System Monitor" section in
linkgit:git-update-index[1].

CAVEATS
-------

The fsmonitor daemon does not currently know about submodules and does
not know to filter out file system events that happen within a
submodule. If the fsmonitor daemon is watching a super repo and a file is
modified within the working directory of a submodule, it will report
the change (as happening against the super repo). However, the client
should properly ignore these extra events, so performance may be affected
but it should not cause an incorrect result.

GIT
---
Part of the linkgit:git[1] suite
4 changes: 3 additions & 1 deletion Documentation/git-update-index.txt
@@ -498,7 +498,9 @@ FILE SYSTEM MONITOR
This feature is intended to speed up git operations for repos that have
large working directories.

It enables git to work together with a file system monitor (see the
It enables git to work together with a file system monitor (see
linkgit:git-fsmonitor--daemon[1]
and the
"fsmonitor-watchman" section of linkgit:githooks[5]) that can
inform it as to what files have been modified. This enables git to avoid
having to lstat() every file to find modified files.
3 changes: 2 additions & 1 deletion Documentation/githooks.txt
@@ -584,7 +584,8 @@ fsmonitor-watchman

This hook is invoked when the configuration option `core.fsmonitor` is
set to `.git/hooks/fsmonitor-watchman` or `.git/hooks/fsmonitor-watchmanv2`
depending on the version of the hook to use.
depending on the version of the hook to use, unless overridden via
`core.useBuiltinFSMonitor` (see linkgit:git-config[1]).

Version 1 takes two arguments, a version (1) and the time in elapsed
nanoseconds since midnight, January 1, 1970.
105 changes: 105 additions & 0 deletions Documentation/technical/api-simple-ipc.txt
@@ -0,0 +1,105 @@
Simple-IPC API
==============

The Simple-IPC API is a collection of `ipc_` prefixed library routines
and a basic communication protocol that allow an IPC-client process to
send an application-specific IPC-request message to an IPC-server
process and receive an application-specific IPC-response message.

Communication occurs over a named pipe on Windows and a Unix domain
socket on other platforms. IPC-clients and IPC-servers rendezvous at
a previously agreed-to application-specific pathname (which is outside
the scope of this design) that is local to the computer system.

The IPC-server routines within the server application process create a
thread pool to listen for connections and receive request messages
from multiple concurrent IPC-clients. When received, these messages
are dispatched up to the server application callbacks for handling.
IPC-server routines then incrementally relay responses back to the
IPC-client.

The IPC-client routines within a client application process connect
to the IPC-server and send a request message and wait for a response.
When received, the response is returned to the caller.

For example, the `fsmonitor--daemon` feature will be built as a server
application on top of the IPC-server library routines. It will have
threads watching for file system events and a thread pool waiting for
client connections. Clients, such as `git status`, will request a list
of file system events since a point in time and the server will
respond with a list of changed files and directories. The formats of
the request and response are application-specific; the IPC-client and
IPC-server routines treat them as opaque byte streams.


Comparison with sub-process model
---------------------------------

The Simple-IPC mechanism differs from the existing `sub-process.c`
model (Documentation/technical/long-running-process-protocol.txt) that
is used by applications like Git-LFS. In the LFS-style sub-process model
the helper is started by the foreground process, communication happens
via a pair of file descriptors bound to the stdin/stdout of the
sub-process, the sub-process only serves the current foreground
process, and the sub-process exits when the foreground process
terminates.

In the Simple-IPC model the server is a very long-running service. It
can service many clients at the same time and has a private socket or
named pipe connection to each active client. It might be started
(on-demand) by the current client process or it might have been
started by a previous client or by the OS at boot time. The server
process is not associated with a terminal and it persists after
clients terminate. Clients do not have access to the stdin/stdout of
the server process and therefore must communicate over sockets or
named pipes.


Server startup and shutdown
---------------------------

How an application server based upon IPC-server is started is also
outside the scope of the Simple-IPC design and is a property of the
application using it. For example, the server might be started or
restarted during routine maintenance operations, or it might be
started as a system service during the system boot-up sequence, or it
might be started on-demand by a foreground Git command when needed.

Similarly, server shutdown is a property of the application using
the simple-ipc routines. For example, the server might decide to
shut down when idle or only upon explicit request.


Simple-IPC protocol
-------------------

The Simple-IPC protocol consists of a single request message from the
client and an optional response message from the server. Both the
client and server messages are unlimited in length and are terminated
with a flush packet.

The pkt-line routines (Documentation/technical/protocol-common.txt)
are used to simplify buffer management during message generation,
transmission, and reception. A flush packet is used to mark the end
of the message. This allows the sender to incrementally generate and
transmit the message. It allows the receiver to incrementally receive
the message in chunks and to know when they have received the entire
message.
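
As a rough illustration (not part of the patch), the framing described above can be sketched in C as follows. It relies only on the pkt-line wire format: a four-hex-digit length that counts its own four bytes, followed by the payload, with `0000` as the flush packet. The `pkt_frame()`, `pkt_unframe()`, and `hex4()` helpers are hypothetical names for this sketch and are not Git's pkt-line API. They work on in-memory buffers rather than on a socket or pipe.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define PKT_HDR 4
#define PKT_MAX_DATA 65516 /* 65520-byte pkt-line limit minus the header */

/* Parse four hex digits; returns -1 if any character is not hex. */
static int hex4(const char *s)
{
	int i, v = 0;

	for (i = 0; i < 4; i++) {
		char c = s[i];
		int d;

		if (c >= '0' && c <= '9')
			d = c - '0';
		else if (c >= 'a' && c <= 'f')
			d = c - 'a' + 10;
		else if (c >= 'A' && c <= 'F')
			d = c - 'A' + 10;
		else
			return -1;
		v = v * 16 + d;
	}
	return v;
}

/*
 * Split `msg` into data packets of at most `chunk` payload bytes and
 * append a flush packet.  Returns bytes written, or 0 if `out` is too small.
 */
static size_t pkt_frame(const char *msg, size_t len, size_t chunk,
			char *out, size_t cap)
{
	size_t pos = 0;

	assert(chunk > 0 && chunk <= PKT_MAX_DATA);
	while (len) {
		size_t n = len < chunk ? len : chunk;
		char hdr[PKT_HDR + 1];

		if (pos + PKT_HDR + n > cap)
			return 0;
		snprintf(hdr, sizeof(hdr), "%04lx", (unsigned long)(n + PKT_HDR));
		memcpy(out + pos, hdr, PKT_HDR);
		memcpy(out + pos + PKT_HDR, msg, n);
		pos += PKT_HDR + n;
		msg += n;
		len -= n;
	}
	if (pos + PKT_HDR > cap)
		return 0;
	memcpy(out + pos, "0000", PKT_HDR);
	return pos + PKT_HDR;
}

/*
 * Reassemble the payload up to the flush packet.  Returns the payload
 * length, or -1 on malformed input or when no flush packet is found.
 */
static long pkt_unframe(const char *in, size_t in_len, char *msg, size_t cap)
{
	size_t pos = 0, out = 0;

	while (pos + PKT_HDR <= in_len) {
		int n = hex4(in + pos);

		pos += PKT_HDR;
		if (n == 0)
			return (long)out; /* flush packet: message complete */
		if (n < PKT_HDR)
			return -1; /* bad digits or a special packet */
		n -= PKT_HDR;
		if (pos + n > in_len || out + n > cap)
			return -1;
		memcpy(msg + out, in + pos, n);
		pos += n;
		out += n;
	}
	return -1;
}
```

Because the receiver stops at the flush packet rather than at end-of-stream, a sender can emit packets as it generates them, and a truncated message is detected instead of being mistaken for a complete one.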

The actual byte formats of the client request and server response
messages are application-specific. The IPC layer transmits and
receives them as opaque byte buffers without any concern for the
content within. It is the job of the calling application layer to
understand the contents of the request and response messages.


Summary
-------

Conceptually, the Simple-IPC protocol is similar to an HTTP REST
request. Clients connect, make an application-specific and
stateless request, receive an application-specific
response, and disconnect. It is a one round trip facility for
querying the server. The Simple-IPC routines hide the socket,
named pipe, and thread pool details and allow the application
layer to focus on the application at hand.
24 changes: 24 additions & 0 deletions Makefile
@@ -464,6 +464,11 @@ all::
# directory, and the JSON compilation database 'compile_commands.json' will be
# created at the root of the repository.
#
# If your platform supports a built-in fsmonitor backend, set
# FSMONITOR_DAEMON_BACKEND to the name of the corresponding
# `compat/fsmonitor/fsmonitor-fs-listen-<name>.c` that implements the
# `fsmonitor_fs_listen__*()` routines.
#
# Define DEVELOPER to enable more compiler warnings. Compiler version
# and family are auto detected, but could be overridden by defining
# COMPILER_FEATURES (see config.mak.dev). You can still set
@@ -736,6 +741,7 @@ TEST_BUILTINS_OBJS += test-serve-v2.o
TEST_BUILTINS_OBJS += test-sha1.o
TEST_BUILTINS_OBJS += test-sha256.o
TEST_BUILTINS_OBJS += test-sigchain.o
TEST_BUILTINS_OBJS += test-simple-ipc.o
TEST_BUILTINS_OBJS += test-strcmp-offset.o
TEST_BUILTINS_OBJS += test-string-list.o
TEST_BUILTINS_OBJS += test-submodule-config.o
@@ -882,6 +888,7 @@ LIB_OBJS += fetch-pack.o
LIB_OBJS += fmt-merge-msg.o
LIB_OBJS += fsck.o
LIB_OBJS += fsmonitor.o
LIB_OBJS += fsmonitor-ipc.o
LIB_OBJS += gettext.o
LIB_OBJS += gpg-interface.o
LIB_OBJS += graph.o
@@ -1081,6 +1088,7 @@ BUILTIN_OBJS += builtin/fmt-merge-msg.o
BUILTIN_OBJS += builtin/for-each-ref.o
BUILTIN_OBJS += builtin/for-each-repo.o
BUILTIN_OBJS += builtin/fsck.o
BUILTIN_OBJS += builtin/fsmonitor--daemon.o
BUILTIN_OBJS += builtin/gc.o
BUILTIN_OBJS += builtin/get-tar-commit-id.o
BUILTIN_OBJS += builtin/grep.o
@@ -1667,6 +1675,14 @@ ifdef NO_UNIX_SOCKETS
BASIC_CFLAGS += -DNO_UNIX_SOCKETS
else
LIB_OBJS += unix-socket.o
LIB_OBJS += unix-stream-server.o
LIB_OBJS += compat/simple-ipc/ipc-shared.o
LIB_OBJS += compat/simple-ipc/ipc-unix-socket.o
endif

ifdef USE_WIN32_IPC
LIB_OBJS += compat/simple-ipc/ipc-shared.o
LIB_OBJS += compat/simple-ipc/ipc-win32.o
endif

ifdef NO_ICONV
@@ -1881,6 +1897,11 @@ ifdef NEED_ACCESS_ROOT_HANDLER
COMPAT_OBJS += compat/access.o
endif

ifdef FSMONITOR_DAEMON_BACKEND
COMPAT_CFLAGS += -DHAVE_FSMONITOR_DAEMON_BACKEND
COMPAT_OBJS += compat/fsmonitor/fsmonitor-fs-listen-$(FSMONITOR_DAEMON_BACKEND).o
endif

ifeq ($(TCLTK_PATH),)
NO_TCLTK = NoThanks
endif
@@ -2731,6 +2752,9 @@ GIT-BUILD-OPTIONS: FORCE
@echo PAGER_ENV=\''$(subst ','\'',$(subst ','\'',$(PAGER_ENV)))'\' >>$@+
@echo DC_SHA1=\''$(subst ','\'',$(subst ','\'',$(DC_SHA1)))'\' >>$@+
@echo X=\'$(X)\' >>$@+
ifdef FSMONITOR_DAEMON_BACKEND
@echo FSMONITOR_DAEMON_BACKEND=\''$(subst ','\'',$(subst ','\'',$(FSMONITOR_DAEMON_BACKEND)))'\' >>$@+
endif
ifdef TEST_OUTPUT_DIRECTORY
@echo TEST_OUTPUT_DIRECTORY=\''$(subst ','\'',$(subst ','\'',$(TEST_OUTPUT_DIRECTORY)))'\' >>$@+
endif
1 change: 1 addition & 0 deletions builtin.h
@@ -158,6 +158,7 @@ int cmd_for_each_ref(int argc, const char **argv, const char *prefix);
int cmd_for_each_repo(int argc, const char **argv, const char *prefix);
int cmd_format_patch(int argc, const char **argv, const char *prefix);
int cmd_fsck(int argc, const char **argv, const char *prefix);
int cmd_fsmonitor__daemon(int argc, const char **argv, const char *prefix);
int cmd_gc(int argc, const char **argv, const char *prefix);
int cmd_get_tar_commit_id(int argc, const char **argv, const char *prefix);
int cmd_grep(int argc, const char **argv, const char *prefix);
3 changes: 2 additions & 1 deletion builtin/credential-cache--daemon.c
@@ -203,9 +203,10 @@ static int serve_cache_loop(int fd)

static void serve_cache(const char *socket_path, int debug)
{
struct unix_stream_listen_opts opts = UNIX_STREAM_LISTEN_OPTS_INIT;
int fd;

fd = unix_stream_listen(socket_path);
fd = unix_stream_listen(socket_path, &opts);
if (fd < 0)
die_errno("unable to bind to '%s'", socket_path);
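
The hunk above moves the caller to an options struct that is initialized from an `_INIT` macro and passed by pointer. A minimal sketch of that pattern, written against plain POSIX sockets, might look like the following. The struct, its `backlog` field, the default of 5, and every `*_sketch` name are assumptions for illustration. They are not the definitions in `unix-socket.h`.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical options struct; the field name is illustrative only. */
struct listen_opts_sketch {
	int backlog; /* <= 0 means "use the default" */
};
#define LISTEN_OPTS_SKETCH_INIT { 0 }
#define DEFAULT_BACKLOG_SKETCH 5

/*
 * Create a listening Unix domain stream socket at `path`.
 * Returns the file descriptor, or -1 with errno set.
 */
static int stream_listen_sketch(const char *path,
				const struct listen_opts_sketch *opts)
{
	struct sockaddr_un sa;
	int fd, backlog, saved_errno;

	if (strlen(path) >= sizeof(sa.sun_path)) {
		errno = ENAMETOOLONG;
		return -1;
	}
	memset(&sa, 0, sizeof(sa));
	sa.sun_family = AF_UNIX;
	strcpy(sa.sun_path, path);

	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	unlink(path); /* remove a stale socket file, if any */
	if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0)
		goto fail;

	/* Fall back to the default when the caller left the field unset. */
	backlog = opts->backlog > 0 ? opts->backlog : DEFAULT_BACKLOG_SKETCH;
	if (listen(fd, backlog) < 0)
		goto fail;
	return fd;

fail:
	saved_errno = errno;
	close(fd);
	errno = saved_errno;
	return -1;
}
```

With this shape, a caller that wants the old behavior declares the struct from the `_INIT` macro and passes its address unchanged, as `serve_cache()` does above. New options can then be added later without touching such callers.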

Expand Down