User-space recursive watcher #18

Open
nathany opened this issue Jun 29, 2014 · 36 comments

Comments

@nathany
Contributor

nathany commented Jun 29, 2014

@jbowtie requested this feature at howeyc/fsnotify#56.

Windows and FSEvents (OS X) can watch a directory tree; in fact, FSEvents is built entirely for this purpose:

if you monitor a directory with FSEvents then the monitor is recursive; there is no non-recursive option - @samjacobson howeyc/fsnotify#54 (comment)

Ideally fsnotify would expose a way to use this functionality on Windows and Macs.

Unfortunately inotify (Linux, Android), kqueue (BSD, iOS), and FEN (Solaris) don't have this capability.

There is some debate as to whether or not fsnotify should include a user-space recursive watcher to make up the difference. If not, perhaps it can be implemented as a separate wrapper or utility library. In any case, it's something often requested, so we should definitely be thinking about it.

Implementation

In two parts:

  1. Walk subdirectories to Add (or Remove) watches for each directory:
    • To avoid watching too much, it may be important to (optionally?) skip hidden directories (.git, .hg).
    • My proposal to configure Ops filtering globally at the Watcher level (Built-in event filtering of Ops #7) simplifies recursive watching.
  2. Listen for incoming Create events to watch additional directories as they are created (again, avoiding hidden directories).
@nathany
Contributor Author

nathany commented Jun 29, 2014

Godoc is a relatively small program. It is built from 102 packages built from 582 source files. We certainly want to be able to build programs larger than godoc. It appears that Linux inotify will let you watch individual directories for changes within that directory, so for godoc you are looking at a little over 102 inotify watches. For OS X, fsevents will let you watch whole subtrees, so for godoc built from 1 GOROOT and 1 GOPATH entry you are looking at 2 fsevents watches. For Microsoft Windows, FindFirstChangeNotification looks like it might be usable similar to fsevents. For Solaris, file event notifications (FEN) only apply to individual files or directories, and watching a directory inode does not appear to tell you about modifications made to files in that directory, so for godoc you are looking at almost 700 FEN watches. That might take a little while to set up, but assuming the kernel has no hard limits, it should be fine. Speaking of limits... For BSD (or OS X if you don't like fsevents), kqueue has the same "enumerate every file or directory" requirement as Solaris FEN, but you give them to the kernel not as file names but as file descriptors. It appears that the file descriptor must remain open while you are watching that file, so the per-process file descriptor limit imposes a limit on the number of things you can watch. Worse, the per-system file descriptor limit imposes a limit on the number of things anyone on the system can watch. The typical kernel limit on a BSD is only on the order of 10,000 file descriptors for the entire machine. - @rsc, https://groups.google.com/d/msg/golang-dev/bShm2sqbrTY/IR8eI20Su8EJ

  • The limits for FEN (Solaris) are high enough (File Events Notifications on Solaris #12) that a recursive watcher should be possible.
  • It's possible that kqueue (BSD, iOS) can't reasonably support a recursive watcher. For large trees, users may need to resort to Polling (Polling fallback #9) or another mechanism.

@aktau

aktau commented Jul 4, 2014

The question then becomes whether we want to fall back to the most efficient way of doing subtree watches that succeeds on a given OS, and whether we want to tell the user that the fallback happened, or whether the user should be able to configure if a fallback is allowed at all. This might be handy because a user might prefer to have no directory watching at all rather than polling, which might put too much stress on the system. In that case we would plainly tell the user that the feature is disabled.

I, for one, would absolutely love subtree watches, but understand the difficulties of getting it right across platforms.

@purpleidea

Is the implementation in the https://github.com/xyproto/recwatch package correct, or are there issues? I ask because it looks as if this is a more subtle, tricky problem. cc @nathany @xyproto

Cheers

@nathany
Contributor Author

nathany commented Oct 13, 2015

It should work. Most of the tricky issues are with too many resources being consumed. Also, we would ideally take advantage of native recursive watching on the operating systems that support it.

@landaire

This comment was marked as spam.

@nathany
Contributor Author

nathany commented Mar 3, 2016

No update for fsnotify, but take a look at https://github.com/rjeczalik/notify as an alternative.

@xyproto

xyproto commented Mar 3, 2016

I use recwatch in algernon, which works great on Linux/Windows/OS X. The only gotcha is to watch out for OS limitations in how many files can be watched. Especially on OS X the ulimit is very low by default.

@nathany
Contributor Author

nathany commented Mar 3, 2016

Rafal's notify library uses FSEvents on OS X instead of kqueue, so it doesn't have that problem on OS X.

@xyproto

xyproto commented Mar 3, 2016

FSEvents sounds like a good approach. The notify package has a few unresolved issues, though.

@nathany
Contributor Author

nathany commented Mar 3, 2016

As does fsnotify.

@unixist

unixist commented Apr 13, 2016

Neither github.com/rjeczalik/notify nor recwatch place a watch on newly-added directories. This is fine if the directory structure is static, for example for anomaly detection in a server environment or on a corporate asset.

I have a use case (github.com/unixist/cryptostalker) that aims to detect crypto ransom malware on end-user computers. In such an environment, directories are expected to come and go all the time.

Is there a potential for resource exhaustion that I'm unaware of, leading to this feature not being implemented more widely (and recursion not being implemented at all in fsnotify)?

@unixist

unixist commented Apr 13, 2016

OK, that might be a fluke in my testing, since I created directories with mkdir -p. Still, any comments are welcome.

@landaire

@unixist not sure what system you're on but I was able to get recwatch to work on OS X with no issues besides having to update it to use this fsnotify package instead of the old one hosted under a different user/org (although I didn't try mkdir -p). You can see the code I used here.

@unixist

unixist commented Apr 13, 2016

@landaire, I tore down my test system, but I'm thinking it was a premature post. Though I'm unsure what it was about the multiple directory creation that threw fsnotify for a loop.

I'd be interested to know what you find doing the mkdir -p test on your code.

@ppknap
Contributor

ppknap commented Apr 13, 2016

Neither github.com/rjeczalik/notify nor recwatch place a watch on newly-added directories.

Could you provide a test case that reproduces this? @rjeczalik's notify package should add new directories when watching recursively. It is a bug if it doesn't.

@unixist

unixist commented Apr 13, 2016

I tore down my test environment, but you can test by placing a watch on $dir and then doing "mkdir -p $dir/foo/bar/baz". I tested this on OS X. Maybe the underlying file system event handling differs by platform.


@nathany
Contributor Author

nathany commented Apr 15, 2016

Maybe the underlying file system event handling differs by platform.

Yes, it does.

@Dominik-K

Dominik-K commented Mar 30, 2017

Hi. fswatch may be a good place to look for fsnotify, especially for this issue. They state it has "no known limitations" on Solaris kernels (with FEN), i.e. also with recursive monitoring.

@dc0d

This comment was marked as spam.

@hitzhangjie

Could someone provide an example of how to work around this problem?

I want to watch folders, and the files underneath them, that are generated by some tools.

@maomao2523

I have made a small change and it works properly. How about that?
image

@arp242 arp242 removed this from the v3 Recursive milestone Jul 29, 2022
@arp242 arp242 removed the API label Jul 29, 2022
arp242 added a commit that referenced this issue Nov 17, 2022
Recursive watches can be added by using a "/..." parameter, similar to
the Go command:

	w.Add("dir")      // Behaves as before.
	w.Add("dir/...")  // Watch dir and all paths underneath it.

	w.Remove("dir")      // Remove the watch for dir and, if
	                     // recursive, all watches underneath it too

	w.Remove("dir/...")  // Behaves like just "dir" if the path was
	                     // recursive, error otherwise (probably
	                     // want to add recursive remove too at some
	                     // point).

The advantage of using "/..." vs. an option is that it can be easily
specified in configuration files and the like.

This should be expanded to other backends too; I started with Windows
because the implementation is both the easiest and has the least
amount of control (just setting a boolean parameter), and we can focus
mostly on writing tests for it, and we can then match the inotify and
kqueue behaviour to the Windows one.

Updates #18
arp242 added a commit that referenced this issue Nov 17, 2022
Recursive watches can be added by using a "/..." parameter, similar to
the Go command:

	w.Add("dir")         // Behaves as before.
	w.Add("dir/...")     // Watch dir and all paths underneath it.

	w.Remove("dir")      // Remove the watch for dir and, if
	                     // recursive, all paths underneath it too

	w.Remove("dir/...")  // Behaves like just "dir" if the path was
	                     // recursive, error otherwise (probably
	                     // want to add recursive remove too at some
	                     // point).

The advantage of using "/..." vs. an option is that it can be easily
specified in configuration files and the like; for example from a TOML
file:

	[watches]
	dirs = ["/tmp/one", "/tmp/two/..."]

Options for this were previously discussed at:
#339 (comment)

This should be expanded to other backends too; I started with Windows
because the implementation is both the easiest and has the least
amount of control (just setting a boolean parameter), and we can focus
mostly on writing tests and documentation for it, and we can
then match the inotify and kqueue behaviour to the Windows one.

Updates #18
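
As an illustration of the "/..." convention described in the commit message above, a caller reading watch paths from a configuration file could split off the suffix as in the sketch below. `splitRecursive` is a hypothetical helper, not a function from fsnotify, and treating a bare "/..." as the filesystem root is an assumption made here.

```go
package main

import (
	"fmt"
	"strings"
)

// splitRecursive reports whether name ends in "/..." and returns the
// path with that suffix removed. Hypothetical helper for illustration.
func splitRecursive(name string) (dir string, recursive bool) {
	const suffix = "/..."
	if !strings.HasSuffix(name, suffix) {
		return name, false
	}
	dir = strings.TrimSuffix(name, suffix)
	if dir == "" {
		dir = "/" // assumption: "/..." means the filesystem root
	}
	return dir, true
}

func main() {
	// The same values as in the TOML example from the commit message.
	for _, name := range []string{"/tmp/one", "/tmp/two/..."} {
		dir, recursive := splitRecursive(name)
		fmt.Printf("%s: dir=%s recursive=%t\n", name, dir, recursive)
	}
}
```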
@myitcv

myitcv commented May 5, 2023

In case it's of interest to folks following here, we just created https://pkg.go.dev/github.com/cue-lang/cuelang.org@v0.0.0-20230505131944-cc11e9153697/internal/fsnotify. As the package documentation notes:

Package fsnotify is a light wrapper around github.com/fsnotify/fsnotify that allows for recursively watching directories, and provides a simple wrapper for batching events.

Note this is only available in the alpha branch for now:

https://github.com/cue-lang/cuelang.org/tree/alpha/internal/fsnotify

I have not had great experience with recursive watcher implementations in the past, so one goal here was to establish good tests to cover edge cases.

Feedback welcome on the API.

@shayneoneill

Are there any other libraries that do the recursive watcher thing? It's sadly a bit of a non-starter for me without it (the tree being watched is kind of a labyrinth).

@pablodz

pablodz commented Jul 12, 2023

Are there any other libraries that do the recursive watcher thing? It's sadly a bit of a non-starter for me without it (the tree being watched is kind of a labyrinth).

If running on Linux, try https://github.com/pablodz/inotifywaitgo, a workaround that I developed for this edge case.

@imsodin

imsodin commented Jul 12, 2023

There is also https://github.com/rjeczalik/notify - still a potential project on my back-backlog to merge the two projects. Basically use fsnotify within notify for the underlying non-recursive watching. (Also for the avoidance of doubt: Back-backlog means unlikely to ever happen :) ).

@shayneoneill

shayneoneill commented Jul 31, 2023

It might be worth putting up a bit of a notice on the front page, as unfortunately it's not really obvious that it's not supported until it doesn't work, and I suspect 90% of use cases would require recursive watching (as in, I can't imagine any use cases where it wouldn't be a requirement?).

edit: My bad, it is actually in the FAQ.
