Listener does not follow symlinks #25

Closed
akerbos opened this issue Apr 25, 2012 · 71 comments
Labels
✨ Feature Adds a new feature
Milestone

Comments

@akerbos

akerbos commented Apr 25, 2012

Consider the following structure:

main
|- sub1
|  |- file1
|- sub2 --> somewhere
   |- file2

where sub2 is a symlink to some other (read- and writable) directory. Assume we watch main and ignore nothing.

While changes to file1 are dutifully reported, changes to file2 are not.

@ttilley
Member

ttilley commented Apr 25, 2012

With this scenario you aren't watching one path any more, so it's not going to be sane to make consistent across backends.

...which does bring up a good point that it should probably be documented somewhere at the least.

@akerbos
Author

akerbos commented Apr 25, 2012

Well, if I follow the symlink via cd, pwd will suggest that I am in a subfolder of main.

In any case, afaik listen can do multiple watch directories, so maybe it can automatically detect and treat symlinks this way? (Beware the cycles!)

@thibaudgg
Member

Yeah, we should be able to detect symlinks and automatically watch the symlinked directories. 3 questions come to my mind:

  1. What about symlinked files?
  2. Should the reported paths be the symlinked ones or the original ones?
  3. What about directories with a lot of symlinks?

@Maher4Ever does this look feasible to you? Is it a good idea?

@akerbos
Author

akerbos commented Apr 26, 2012

ad 1: Watch them, of course!

ad 2: I think the original ones would cause confusion. On GNU/Linux, symlinks are transparent for the user most of the time; this should extend to our situation here.

ad 3: Is there a reason to treat them differently from directories with a lot of files?

@ttilley
Member

ttilley commented Apr 26, 2012

So if I'm watching a remote SSHFS mount... Obviously I don't want to do a recursive search for symlinks, never mind multiple levels down.

Mac fseventd calls realpath() on anything submitted to it so you don't have much choice in the matter when it comes to what to report. It's a bit frustrating actually, as the mac specific and unix APIs treat /private differently (among other things).

Perhaps following symlinks, if you want to provide that functionality, can be an optional feature?


@Maher4Ever
Contributor

This is an interesting bug/feature-request, namely because there are differences between what a "symbolic link" means across operating systems. Besides the notes that @ttilley mentioned above, I'm also concerned with how this feature should work on Windows. I initially thought Windows had no notion of a symlink, but it seems I was wrong.

@thibaudgg To answer your questions:

  • Symlinked files do get picked up by adapters (I just tested in Linux), so I don't think we should do anything more about this.
  • Returning the original paths would be easier to implement, but I agree with @akerbos that it would be very confusing to users. How about we add the option to choose which one to return?
  • In this situation the ruby process would use a huge amount of memory, which might be confusing to some users if they only have a few symlinks in the watched directory. This is why I agree with @ttilley that this feature should be an opt-in feature (disabled by default).

Nevertheless, I think it is a good feature to have in Listen. So props to @akerbos for reporting this limitation. I'll investigate the feasibility more this weekend before deciding how to implement this. @thibaudgg Seems good to you?

@akerbos
Author

akerbos commented Apr 26, 2012

@ttilley As you mention it, most GNU tools allow you to switch following symlinks off. I think this is a very reasonable thing to do.

@Maher4Ever I have never used any Windows after XP, so I'd have to trust Wikipedia: "Symbolic links are designed to aid in migration and application compatibility with POSIX operating systems. Microsoft aimed for Vista's symbolic links to 'function just like UNIX links'". So hope is not lost?

As for the reported paths, switching between both is certainly the most defensive way of approaching things, especially because operating systems may have different conventions in the first place.

Is following symlinks more expensive than looking into a "local" folder, assuming both the current folder and the link's target reside on the same file system (or at least machine)? I'd say users are responsible for what they set out to watch; it is not as if symlinks just pop up, you add them specifically to include something in that place (that said, make sure to ignore ..! :>). So my vote is for opt-out, but I can live with opt-in, too.

Good luck in implementing this, I am confident you will find a good solution!

@thibaudgg
Member

@Maher4Ever Yeah sounds good to me, I would be conservative (but I stay curious ;-)) and choose the opt-in too. Thanks!

@Maher4Ever
Contributor

After I looked at how the OS-monitors work when it comes to symbolic directories, here is what I've learned:

  • Linux: Inotify does follow symbolic directories and reports changes with paths inside the symbolic directory (not the realpath where the event actually happened).
  • Mac: fsevent does not follow symbolic directories.
  • Windows: I'm still not sure if we would keep supporting FChange for Windows (because of the performance issues), so I didn't check it there yet.
  • All systems: polling could support symbolic directories if a list of all files inside that symbolic directory is stored for comparison later on.
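
The polling idea in the last bullet can be sketched in plain Ruby. This is only an illustration under stated assumptions, not how Listen's polling adapter or directory record is implemented; `snapshot` and `diff_snapshots` are hypothetical names. It stores a path-to-mtime list that includes files inside symlinked directories and compares two such lists later.

```ruby
require 'set'

# Hypothetical helper: build a { path => mtime } snapshot of every file under
# `root`, descending into symlinked directories too. Paths are reported as
# seen from `root` (the symlinked view), not as realpaths.
def snapshot(root, seen = Set.new, result = {})
  real = File.realpath(root)
  return result if seen.include?(real) # already visited: a cycle or a second link
  seen << real

  Dir.children(root).sort.each do |name|
    path = File.join(root, name)
    next unless File.exist?(path)      # skip dangling symlinks
    if File.directory?(path)           # File.directory? follows symlinks
      snapshot(path, seen, result)
    else
      result[path] = File.mtime(path)  # File.mtime follows symlinks as well
    end
  end
  result
end

# Hypothetical helper: compare two snapshots and classify the differences.
def diff_snapshots(before, after)
  {
    added:    after.keys - before.keys,
    removed:  before.keys - after.keys,
    modified: (before.keys & after.keys).reject { |p| before[p] == after[p] }
  }
end
```

Calling `snapshot` twice with a pause in between and passing both results to `diff_snapshots` gives the changed paths. A directory reachable through two different links is only recorded under the first path that reaches it, which is one of the trade-offs discussed in this thread.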

With these facts, we would need to implement this feature differently in almost every adapter, which is not how the adapters work right now. That's because on Mac a watcher would need to be assigned to each symbolic directory, but with inotify on Linux it wouldn't. Polling would need some major changes to support symbolic directories.

As @ttilley mentioned above, Mac returns the realpaths of the changes. inotify, on the other hand, doesn't. This means some sort of path conversion would be necessary, depending on which type of paths we want to report. Even after this, I wonder if we could keep the results (paths) consistent across the adapters.

What we could actually do is add the :follow_symlinks option to the multi-listener. I like this idea for two reasons:

  1. Using the multi-listener would suggest that we consider the symlinks as separate directories. The user could expect that we would return the realpaths of the changes, not paths inside the symlink (like inotify does).
  2. The :relative_paths option is already disabled on the multi-listener, which we don't want for the symlinks.

The multi-listener would accept a single normal directory. Then we could simply scan the watched directory for any symbolic directory and add it to the watched directories. This approach would also keep the results consistent, because no magic would be needed to ensure the consistency across the adapters.

I would love to know your thoughts about the second approach, as it seems the least problematic one to me.
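
The scan step of the second approach could look roughly like the following. It is a sketch only: `watch_roots` is a hypothetical helper, not part of Listen, and it assumes the resulting list would then be handed to something that watches several directories.

```ruby
require 'find'

# Hypothetical helper: return the watched root plus the real location of every
# symlinked directory found beneath it. Find.find does not descend into
# symlinks, so each link is seen once and a link cycle cannot make this loop.
def watch_roots(root)
  real_root = File.realpath(root)
  roots = [real_root]
  Find.find(real_root) do |path|
    next unless File.symlink?(path) && File.directory?(path)
    roots << File.realpath(path)
  end
  roots.uniq
end
```

For the structure from the first comment, `watch_roots('main')` would return the real path of main followed by the real path that sub2 points to. Symlinks that live inside a link target are not discovered by this single pass, and symlinked files are ignored.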

@akerbos
Author

akerbos commented May 5, 2012

Huh, I guess Inotify is not used on my (Linux) machines, then?

I think your (second) proposal breaks consistency a bit. "Listening to a single directory works with A, but not if it contains symlinks; use B then. B is also used to listen to multiple directories." This does not make much sense from a user's point of view. Also, I have to know beforehand whether my directory contains symlinks or not; I might end up using the multi-listener just in case. Which prompts the question: why are there even two components? As far as I can see, the multi-listener is (conceptually) only a wrapper around multiple single-listeners; if that is correct, hide away the single one. This would make for a narrower interface, which is always good.

That said, handling symlinked (sub)directories as if they were independently watched directories feels awkward, but I can't see any hard disadvantages. Tools using listen might be confused because users provide three paths but seven different paths are reported; if the behaviour is well-documented (which I assume) this is no problem, I guess. Rewriting the paths to match the originally provided hierarchy is probably the most intuitive to use, but causes a runtime penalty. Trade-off time!

@thibaudgg
Member

@ttilley I think all points made by @akerbos are valid, it would be great to have the same behavior on both sides even if we need to remove the :relative_paths option for the SingleListener for that.

@Maher4Ever
Contributor

I did some more digging lately to see how to get following symlinks working on Linux and Mac in a consistent way, and it seems possible without the weirdness of using the MultiListener approach. As noted by @akerbos, using the MultiListener approach to implement this feature would require the user to know beforehand if the watched directory has symlinks.

@akerbos The only difference between the Listener and the MultiListener is the ability to get relative paths in the callback when using the Listener. Since Listen is still a relatively new gem, I can't say for sure that the relative-paths feature will be used (or not), so I think waiting for a while will be wise before removing one of the Listeners.

@thibaudgg I'll implement this feature in the adapters and the directory-record, so the listeners won't have to change at all.

@thibaudgg
Member

@Maher4Ever that's perfect. Thanks!

@akerbos
Author

akerbos commented Oct 18, 2012

@Maher4Ever My current LaTeX project is packed with symlinks, so I thought I'd ping this one. Are there plans for working on this in the near future?

@akerbos
Author

akerbos commented Oct 18, 2012

I just tried it with Ruby 1.9.3, listen 0.5.3 and rb-inotify 0.8.8. It does seem to look into symlinks, but it does not detect cycles/loops and consequently does not terminate. (rb-inotify is apparently not a dependency of listen, I had to install it manually. Bug or feature?)

Without rb-inotify, symlinks are not followed (same as before).

I posted an issue at rb-inotify, too.

@thibaudgg
Member

@akerbos Yeah it's a feature, you need to install rb-inotify manually. If not, you get a warning message.

@akerbos
Author

akerbos commented Oct 18, 2012

@thibaudgg Right, I saw the warning only later.

Huh, now I can't reproduce listening in symlinked folders, even with rb-inotify. O.o (It's there, I checked with an explicit require 'rb-inotify'.)

@dwt

dwt commented Nov 2, 2012

I'd like to add another use case showing the value of following symlinks. We have a product that is customized for many customers, and we switch customers by linking in some configuration files via symlinks from a 'customer' folder.

Now I want to trigger a listen action whenever one of those files changes - but I'd like to watch only the symlinks, so as not to trigger the action when the files of any other customer (there are many, of course) are touched.

So some way of getting listen to watch symlinks would be greatly appreciated.

@akerbos
Author

akerbos commented Nov 2, 2012

Correcting my earlier comment: watching files in symlinked folders works (at least with rb-inotify), but changes in symlinked files are not detected.

@thibaudgg
Member

Closed because no activities, feel free to reopen if interest is back.

@akerbos
Author

akerbos commented Apr 6, 2013

@thibaudgg I'm still interested in watching changes in symlinked files; or did you mean interest from contributors?

@thibaudgg
Member

Yeah, contributors. I personally don't need it and don't have the time to implement it. But a pull request with that feature would be very welcome!

@akerbos
Author

akerbos commented Apr 24, 2013

In that case, you should maybe leave the issue open so contributors can see it as, well, an open issue. :)

@thibaudgg
Member

@akerbos that's fair enough. :)

thibaudgg reopened this Apr 24, 2013

@aspiers

aspiers commented May 13, 2013

Ugh, I just wasted 2 hours tracking this down before finding out it was a known issue :-( It's a pretty bad bug, so IMHO it should be listed in the README ...

@thibaudgg
Member

@aspiers sorry to hear that, I just added a pending features list in the README. 5a1dfe1

@e2
Contributor

e2 commented Apr 19, 2014

TL;DR: there are plans ahead for new listen API that could help here

I'm planning a new API for listen and some features, which would include multiple adapters listening on different directories and running with different options.

Then, for some, the quick-and-dirty solution may be to just set up a TCP listener - and then run broadcasting listen instances in the directories.

The effect? Same as symlinks giving relative paths. It's not as convenient to set up, but in a worst case scenario at least polling is done by a separate process ;)

From a user's perspective, the key is watching directories (and not files - as many intuitively want to) - and then the adapter (!) should always translate absolute paths relative to the directory it was told to listen to. That's so apps would work.

There are two topics to separate: changes and notifications.

E.g. you could have one file changed, but due to symlinks, you may need many notifications.

And that's why you can't "resolve" symlinks.

Watching symlinked content conceptually means:

  1. watching the directory(s) with the symlink(s)
  2. likely watching the directory containing the link target(s)
  3. watching the linked target itself (usually - only if it's a directory)

Because:

  1. If the symlink changes, the path to the same content may no longer be accessible
  2. If the symlink target is renamed or deleted, same thing
  3. And of course, if you move a subdirectory within the linked target directory, the files there are no longer accessible.

This means watching at least 3 objects for one symlink - and then the adapter would have to decide whether all the "different" notifications from each are really different (not as absolute paths - but if they're on the same path that was given for watching, which could contain symlinks).
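
To make the "at least 3 objects" concrete, here is a small sketch that lists them for a single symlink. `watch_points_for` is a hypothetical helper written for this illustration and is not something Listen provides; it only follows one hop of the link.

```ruby
# Hypothetical helper: for one symlink, list the locations described above:
# the directory holding the link, the directory holding the link target, and
# the target itself when it is a directory.
def watch_points_for(link)
  raise ArgumentError, "#{link} is not a symlink" unless File.symlink?(link)

  link_dir = File.realpath(File.dirname(link))
  # readlink may return a relative path; resolve it against the link's directory
  target = File.expand_path(File.readlink(link), link_dir)

  points = [link_dir, File.dirname(target)]
  points << target if File.directory?(target)
  points.uniq
end
```

Every path returned here would produce its own notifications, which is where the de-duplication problem described in the next paragraph comes from.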

Just as if you'd watch the same directory 3 times, it makes sense to get 3 identical notifications. But if you're watching 1 and the adapter is setting up many - the adapter has to remove "duplicates" (which depends not on absolute paths, but on symlinks along the way).

The ideal solution: letting the user decide which directories to watch and for what purpose - and maybe even how to "route" or "map" paths. That means a lot less notifications, a lot less confusion, more flexibility, etc.

"Mapping/routing" may not seem to make sense unless you realize that notifications don't have to produce accurate paths, as long as you know which tools to run (in response), and as long as those tools know where things are.

Case in point - TCP listener, where none of the notifications represent accessible paths if they're on another machine - symlinked or not.

I think the inotify adapter is a great reference, because it implements its own (separate from inotify) recursion handling (but recommends using absolute paths anyway) and when it "doesn't work as expected" it's usually for good design reasons.

Anyway, I'm interested in feedback about this.

P.S. To balance things out, the current available event types will be "dumbed down" and simplified - but for good reasons.

@thibaudgg
Member

👍

@brauliobo

reproduced here with guard-sass and guard 1.8.3, very annoying bug...
a fragile workaround was to run:

sass_locations.map!{ |l| l = "#{File.dirname l}/#{File.readlink l}" while File.symlink? l; l }

@e2
Contributor

e2 commented Sep 26, 2014

After lots of thinking, resolving symlinks should be done at the application level (if possible) or the directories should be reorganized so that watched files are "physically" in the watched directory.

If someone has an issue that cannot be worked around without changes in listen (or a specific adapter), it's best to have a valid real-life example to work with, because otherwise there are just too many edge cases to deal with (and everyone will have a different opinion about the "most intuitive" solution).

Since adapters can't work consistently (by definition), I think it's fine to have more adapter-specific options to let users tweak the behavior to whatever they need.

@brauliobo

@e2 somebody told me it is possible to watch symlinks with the inotify adapter, but I could not find a way to select this adapter with guard. How do I do it?

@akerbos
Author

akerbos commented Sep 26, 2014

@e2, I'm not sure I follow. On Linux, symlinks are essentially transparent; it's not at all clear to me where problems (beyond cycles, which are easy to detect) should arise. From an algorithmic point of view, breadth-first search (up to a certain depth, as an option) on the file system graph, starting in all specified roots, should yield a well-defined set of inodes that are listened to. Can you please detail what issues you see?
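
The breadth-first search described here can be sketched as follows. It is an illustration under assumptions: `directories_to_watch` and its `max_depth` option are hypothetical names, and identifying directories by device and inode number assumes a POSIX-like file system.

```ruby
require 'set'

# Hypothetical helper: breadth-first walk over the directory graph, following
# symlinked directories. Directories are identified by [device, inode], so a
# directory reached again (through a cycle or a second link) is skipped.
def directories_to_watch(roots, max_depth: Float::INFINITY)
  seen  = Set.new
  found = []
  queue = roots.map { |root| [root, 0] }

  until queue.empty?
    dir, depth = queue.shift
    stat = File.stat(dir)                      # File.stat follows symlinks
    next unless seen.add?([stat.dev, stat.ino]) # nil when already visited
    found << dir
    next if depth >= max_depth

    Dir.children(dir).sort.each do |name|
      path = File.join(dir, name)
      queue << [path, depth + 1] if File.directory?(path)
    end
  end
  found
end
```

The result is a finite list even when links form a loop, because each physical directory is visited at most once. Which of several paths to the same directory gets reported depends on traversal order, which is part of what the rest of the thread argues about.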

As for a real-life example, I create exercise sheets for several lectures across several years. All of them include parts of the same (LaTeX) preamble, those of the same lecture share other parts and get their problems from a central folder per lecture (which survives the years). Of course, I set up all of this with symlinks.

@e2
Contributor

e2 commented Sep 26, 2014

@brauliobo - inotify is the native adapter on Linux, and it's used unless you force polling

@e2
Contributor

e2 commented Sep 26, 2014

@akerbos - to avoid a long discussion, do you personally have a specific issue with symlinks on Linux that doesn't currently work as you'd want it to?

Regarding loops, there's: #259 (although I haven't investigated where the loop is traversed exactly - there's also an issue in rb-inotify - guard/rb-inotify#21).

In short, I don't want Listen to list "symlink support", because:

  1. means different things to different people (!)
  2. people will make false assumptions (at the very least getting already resolved paths can be surprising - and can break apps relying on path substitution)
  3. every adapter gives different results on plain files and directories anyway
  4. writing acceptance tests for such cases is not worth the effort (as of now)
  5. since results are adapter specific, it's better for people to use specialized libraries for their scenario (especially if their needs are uncommon)
  6. listen is too complex as it is to take on responsibilities easily handled by other tools or workarounds (e.g. hardlinks, mount/bind, TCP, multiple listen instances, using backend libs directly, resolving symlinks yourself, etc.)
  7. listen's responsibility isn't well defined for files and directories to begin with (it's defined by use cases - which I believe is more than sufficient)
  8. changing the directory structure while listen is running may not provide predictable results
  9. symlinks are mostly relevant only on Linux and OSX - so any "special" symlink handling seems to go against the idea of cross-platform functionality
  10. people should be careful with symlinks and specific about what they expect about handling them - so the less assumptions listen makes about them, the better
  11. performance reasons in some cases (notably - Record building, and relative path building)

Also, only hardlinks are technically transparent (handled by the OS) - symlinks are transparent only if you're not dealing with path processing (which is what Listen does).

@e2
Contributor

e2 commented Nov 13, 2014

I've changed my thoughts about symlinks in Listen...

Basically, we can close this issue if this patch gets pulled in: #273

Why? Because it would guarantee that there's max one reference (symlink or not) to a real subdirectory within a watched directory. This makes it almost trivial to resolve back and forth (symlink <-> realpath) if ever needed.
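
Under that "max one reference" assumption, translating a reported realpath back into the path seen from the watched directory is a table lookup. The sketch below uses hypothetical helpers (`link_table`, `to_watched_path`) and is not taken from the patch mentioned above.

```ruby
require 'find'

# Hypothetical helper: map the real location of each symlinked directory to
# the path under which it appears inside the watched root. Assumes each real
# directory is referenced at most once.
def link_table(root)
  table = {}
  Find.find(File.realpath(root)) do |path|
    table[File.realpath(path)] = path if File.symlink?(path) && File.directory?(path)
  end
  table
end

# Hypothetical helper: rewrite a path reported as a realpath into the path
# seen from the watched root; paths outside any link target are returned as-is.
def to_watched_path(real_path, table)
  table.each do |real_dir, link|
    return link if real_path == real_dir
    if real_path.start_with?(real_dir + File::SEPARATOR)
      return link + real_path[real_dir.length..-1]
    end
  end
  real_path
end
```

The opposite direction needs no table, since `File.realpath` already resolves a symlinked path to its physical location.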

Overall, it's the user's responsibility to avoid "filesystem loops" or even avoid listening to the same physical directories (within a single adapter thread at least) - there's no way Listen can know a user modified something through a symlink or not, while reporting real paths is counter-intuitive.

And feedback is appreciated before I break the world with this patch...

@brauliobo

@e2 great!

@akerbos
Author

akerbos commented Nov 13, 2014

@e2 Cool, thanks! This sounds like what I was thinking about (let the user care about loops). I'll head over to the other issue for the specifics.

@e2
Contributor

e2 commented Nov 18, 2014

I'm closing this since any issue with symlinks will be related to: #274 which needs to be implemented first.

Basically, recursion in Listen prevents implementing smart symlink handling (smart, meaning avoiding loops, avoiding watching the same physical directory multiple times and reporting symlink paths as changed and not the physical ones).

@brauliobo

Fixed code to resolve links:

item = "#{File.dirname item}/#{File.readlink(item).gsub /\/$/, ''}" while File.symlink? item
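
A note on the one-liner: it joins the link's directory with whatever `File.readlink` returns, so it gives a wrong path when the link target is absolute, and it never terminates on a link loop. A slightly more defensive sketch follows; `resolve_links` and `max_hops` are hypothetical names chosen for this example.

```ruby
# Hypothetical helper: follow a chain of symlinks by hand, handling both
# relative and absolute link targets. Gives up after `max_hops` links so a
# loop cannot spin forever.
def resolve_links(item, max_hops: 40)
  hops = 0
  while File.symlink?(item)
    hops += 1
    raise "too many levels of symbolic links: #{item}" if hops > max_hops
    # expand_path keeps an absolute target as it is and resolves a relative
    # one against the directory that contains the link
    item = File.expand_path(File.readlink(item), File.dirname(item))
  end
  item
end
```

When the final target is known to exist, `File.realpath(item)` is the simpler choice: it resolves every symlink along the path, not just the last component, and raises `Errno::ENOENT` for missing files.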

@e2
Contributor

e2 commented Nov 27, 2014

After still a few more attempts to wrap my head around this, I believe these 2 feature requests would help make Listen "intuitive" with regards to symlinks: #280 and #279

@lolgear

lolgear commented Nov 16, 2015

Hey, is it fixed or not? I want to listen to a symlink to my JSON file with localization.
I do:

guard :shell do
  watch(%r{(?<path>^.+?)/localization.json}) do |m|
    n "I AM HERE!" #
    if system("cd #{m[1]} && ruby ios_localization_manager.rb")
      n "#{m[0]} is correct", 'JSON Syntax', :success
    else
      n "#{m[0]} is incorrect", 'JSON Syntax', :failed
    end
  end
end

and nothing happens when I change the original file.

@e2
Contributor

e2 commented Nov 16, 2015

@lolgear - this is a complex issue. And it depends what platform you're on. Or the adapter you're using - if you use polling, it should probably work.

Otherwise, it's a problem, because if you watch symlinked_foo/localization.json, the directory foo has to be watched, which means the event you'll get will be changed: foo/localization.json (not symlinked_foo/localization.json, so the watch pattern won't match).

Just read my above comment - these need to be implemented: #280 and #279

It's a bit of work (it's like creating another abstraction layer above the filesystem) and few people need this.

The solution: do it the other way around. Watch the real file, and symlink it where other tools need it.

@sampath419

sampath419 commented Mar 7, 2017

Modifying a symlink does not work with the listen gem in Rails when deploying the application with Capistrano. Every deployment generates new files in a release directory and the symlink points to the current folder, but listen keeps looking for the older directory.
Creating the symlink the very first time works fine, but after the symlink is modified a second time, listen raises an error.

In the browser, when hitting the server, we get this error:
"Errno::ENOENT No such file or directory @ realpath_rec - /var/www/releases/20170306..."

listen (3.0.8) lib/listen/adapter/config.rb:17:in `realpath'
listen (3.0.8) lib/listen/adapter/config.rb:17:in `realpath'
listen (3.0.8) lib/listen/adapter/config.rb:17:in `block in initialize'
listen (3.0.8) lib/listen/adapter/config.rb:16:in `map'
listen (3.0.8) lib/listen/adapter/config.rb:16:in `initialize'

Note: cap deploy command for symlink -
execute "cd /var/www/retailrecycle && ln -s ./releases/#{File.basename release_path} current"

Thanks for help
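
One possible reading of this report, offered as an assumption rather than a diagnosis: a path that was resolved while `current` pointed at an older release stops existing once that release is removed, and resolving it again then fails with `Errno::ENOENT`. The sketch below reproduces only that mechanism in a scratch directory; the release names `1` and `2` and the helper `simulate_deploy` are made up for the illustration.

```ruby
require 'fileutils'

# Hypothetical simulation: `www` is an empty scratch directory standing in
# for the deploy root.
def simulate_deploy(www)
  old_release = File.join(www, 'releases', '1')
  new_release = File.join(www, 'releases', '2')
  current     = File.join(www, 'current')
  FileUtils.mkdir_p([old_release, new_release])

  File.symlink(old_release, current)
  resolved_at_boot = File.realpath(current) # a path ending in releases/1

  # Deploy: repoint `current`, then clean up the old release
  FileUtils.rm(current)
  File.symlink(new_release, current)
  FileUtils.rm_rf(old_release)

  stale = begin
    File.realpath(resolved_at_boot)         # the old physical path is gone
    false
  rescue Errno::ENOENT
    true
  end
  { stale: stale, current: File.realpath(current) }
end
```

If this is what happens, a long-running process would have to resolve the symlink again (or be restarted) after each deployment rather than reuse the path it resolved at boot.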


16 participants