Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use fsnotify for watching file updates #1667

Open
wants to merge 28 commits into
base: master
Choose a base branch
from

Conversation

joelim-work
Copy link
Collaborator

@joelim-work joelim-work commented Mar 31, 2024

Related:

This patch uses the fsnotify library for watching filesystem updates so that the UI can be updated accordingly. This offers an alternative to polling for updates by using the configuration set period 1, and should address many of the issues associated with it.

The feature can be enabled by configuring set watch true. I plan to have it disabled by default as it has not yet been tested thoroughly.

List of potential fixes:

  • File information (i.e. set info size:time) should be updated accordingly
  • Previews should update accordingly
  • Created/renamed/deleted files should be reflected
  • Directory counts (i.e. set dircounts true) should be updated accordingly
  • Copying a large file will show sizes updated in real time
  • Updates from external processes (e.g. shell commands), should be reflected

Important

Currently attribute updates (e.g. chmod) are being processed, although it is discouraged. Perhaps this can be controlled using an option like watchchmod, in which case it is still possible to refresh the UI after performing chmod, using the load command:

cmd chmod %{{
   chmod "$1" $fx
   touch .
   lf -remote "send $id load"
}}

Note

fsnotify may not work properly for remote/FUSE filesystems. If anyone could help verify this, that would be appreciated.


TODO

  • Add documentation for watch option
  • Mention in documentation that remote/FUSE filesystems don't work as it is not covered by fsnotify

@valoq valoq mentioned this pull request Apr 1, 2024
@valoq
Copy link
Contributor

valoq commented Apr 1, 2024

This feature already works pretty well and I think it can merged now after the recent release to have more people test it. I didn't find any issues so far and while it adds a bit more code (and dependency) to lf, the solution looks a lot cleaner and covers more use cases then what I had in mind with #1671.

Note

fsnotify does not support attribute updates (e.g. chmod)

Actually chmod is listed on the fsnotify repo as a supported attribute and after a quick test with this patch it seems to already cover permission changes even without this 'hack'. (tested on ext4)

Note

fsnotify may not work properly for remote/FUSE filesystems. If anyone could help verify this, that would be appreciated.

A quick test on sshfs seems to confirm that fuse is not covered yet.

@joelim-work
Copy link
Collaborator Author

@valoq Thanks - I updated the description. I have included chmod events for now. Also I very briefly tested with sshfs and found it to be unreliable, sometimes it updates but mostly it doesn't.

@valoq
Copy link
Contributor

valoq commented Apr 3, 2024

#1675 addresses the documentation

Remaining issues I noticed so far:

Perhaps the solution is to mix the previous patch that tagged a directory as "updated" with this fsnotify implementation to address all use cases related to timing.

@joelim-work
Copy link
Collaborator Author

Remaining issues I noticed so far:

Perhaps the solution is to mix the previous patch that tagged a directory as "updated" with this fsnotify implementation to address all use cases related to timing.

Hi @valoq do you have steps to reproduce this? And anything helpful like screen recordings?

Seems to cause issues if there are multiple threads and each one of them
is loading the same new file, in which case there is a race condition.
@joelim-work
Copy link
Collaborator Author

joelim-work commented Apr 4, 2024

OK so I found that there are race conditions with mass updates (e.g. touch {1..100} or rm *), so change to use update nav.dirs all in a single thread in 76aacc7.

@valoq
Copy link
Contributor

valoq commented Apr 4, 2024

OK so I found that there are race conditions with mass updates (e.g. touch {1..100} or rm *), so change to use update nav.dirs all in a single thread in 76aacc7.

That seems to address all of the mentioned issues, thanks.

One more issue I noticed (with and without the threading fix) is that copy actions are very slow now.
While the current v32 of lf takes about two seconds on my system to copy a 1GB file from one drive to the other, with the fsnotify patch it takes about a minute and the file size is constantly updating. Running time lf with v32 and again with the fsnotify patch shows about the same results though.

@valoq
Copy link
Contributor

valoq commented Apr 4, 2024

Actually, with 76aacc7 the copy action is fast again, but now the file size is not updated at the end and #1417 is back again

@joelim-work
Copy link
Collaborator Author

Regarding the slowness, AFAIK there's nothing that can be done about this. When you copy a large file, the data is written in blocks, and fsnotify sends an update each time a block is written. There's no way to have fsnotify send only a single update at the end because copying a large file isn't an atomic operation.

Also I'm not sure how why the file size doesn't update for you at the end? For me I tested with the below config, the file size updates at all times during the copy. You'll have to provide the steps for me to reproduce.

set watch true
set info size

@valoq
Copy link
Contributor

valoq commented Apr 4, 2024

Also I'm not sure how why the file size doesn't update for you at the end?

That was just a mistake on my end, I tested the wrong version. The file size is correctly updated with the latest commit.

Regarding the slowness, AFAIK there's nothing that can be done about this. When you copy a large file, the data is written in blocks, and fsnotify sends an update each time a block is written. There's no way to have fsnotify send only a single update at the end because copying a large file isn't an atomic operation.

One possible solution might be to mark a directory as "updating" during copy operations and skip this line until the copy operation is finished, similar to the approach here. Ignoring this watcher event seems to fix the problem.

@joelim-work
Copy link
Collaborator Author

OK this is slightly hacky but I pushed a commit 33f7204 to try and throttle the writes updates. Hopefully this improves the copying time.

@valoq
Copy link
Contributor

valoq commented Apr 5, 2024

Well it does work I suppose.

While it does take significantly less time now, this branch still doubles the execution times for copy operations.
To copy a 1GB file with v32:

user	0m2.067s
sys	0m3.703s

With fsnotify:

user	0m4.594s
sys	0m6.569s

If I read correctly, the update time is currently set to 10 milliseconds, changing this to 100 milliseconds may also work, though I didn't notice any less cpu circles used with that change.
Maybe this is fine as it is, I will test it some time for now.

@joelim-work
Copy link
Collaborator Author

Thanks, I tried 100 milliseconds very briefly as well, the size updates slightly less frequently. But I don't mind changing the time either way. Although I don't think it's worth making this value configurable.

@joelim-work
Copy link
Collaborator Author

joelim-work commented Apr 12, 2024

While not strictly related to this change, one remaining annoyance is that every time a file is renamed, lf redraws with the renamed file at the bottom of the directory and then redraws again with the renamed file at the right position in accordance with the set sorting order. Any idea what causes this? I will open this as a separate issue if I don't find it today.

The rename was designed to be asynchronous, so as a hack it just adds a dummy file at the bottom to represent the new name while it reloads the directory again in the background via app.nav.dirChan.

lf/nav.go

Lines 1563 to 1567 in ce12116

if dir.loading {
dir.files = append(dir.files, &file{FileInfo: lstat})
}
dir.sel(lstat.Name(), nav.height)

This however means that it's possible for the renamed file to appear at the bottom if the screen is redrawn. c695904 should fix this,

@valoq
Copy link
Contributor

valoq commented Apr 13, 2024

This is better, thank you.
I did catch a glimpse of an additional file being listed for a split second after renaming, but I cannot reproduce this anymore.

Can we avoid the redraw of the renamed directory contents? Since the directory name changed, all contents are redrawn even though nothing inside changed.

@joelim-work
Copy link
Collaborator Author

joelim-work commented Apr 14, 2024

Can we avoid the redraw of the renamed directory contents? Since the directory name changed, all contents are redrawn even though nothing inside changed.

I suspect it's not possible without changing the design of rename to run synchronously (i.e. load the new file, update the internal data structures and then redraw) as opposed to asynchronously (i.e. redraw now, then load in the background and then redraw again).

But anyway this flickering looks like it's not relevant to this watch feature so I am not interested in discussing it here, in the interests of avoiding scope creep.

@valoq
Copy link
Contributor

valoq commented Apr 21, 2024

There appears to be a remaining race condition when updating the number of files inside a directory.

Running lf in /tmp and then executing mkdir /tmp/a && touch /tmp/a/b in a different terminal will show directory a as having 0 files even though the file b is listed inside. If the period option is also used, the dircount will get updated eventually, but since the monitor can replace the period option, using both should not be necessary.

@joelim-work
Copy link
Collaborator Author

There appears to be a remaining race condition when updating the number of files inside a directory.

Running lf in /tmp and then executing mkdir /tmp/a && touch /tmp/a/b in a different terminal will show directory a as having 0 files even though the file b is listed inside. If the period option is also used, the dircount will get updated eventually, but since the monitor can replace the period option, using both should not be necessary.

Thanks, I can reproduce this, but I don't know what can be done about it. This kind of race condition occurs because /tmp/a/b is created before lf has a chance to watch /tmp/a. It's impossible to watch /tmp/a before it gets created, and /tmp can't be watched recursively because fsnotify doesn't support it.

TBH I'm not really sure dircounts (and by extension calcdirsize) mix well with directory watching, because this kind of information can only be obtained by watching the directory itself instead of the containing directory. In other words, if you watch a given directory, you can receive updates about files/directories being added/removed, updates about files being modified, but not directories being modified. Do you actually use dircounts, or are you just testing whether it works?

@valoq
Copy link
Contributor

valoq commented Apr 22, 2024

TBH I'm not really sure dircounts (and by extension calcdirsize) mix well with directory watching, because this kind of information can only be obtained by watching the directory itself instead of the containing directory. In other words, if you watch a given directory, you can receive updates about files/directories being added/removed, updates about files being modified, but not directories being modified.

That does make sense, but perhaps we can trigger an update of the directory if a file inside the directory has changed, similar to the original approach with the updated variable I used here. When a file has been created inside a directory the directory could be marked as updated and refresh its content information after the copy operation has finished.

Do you actually use dircounts, or are you just testing whether it works?

I do and frankly I would expect most users to use it since I would consider it a basic feature for a file manager.

I considered for a moment if this issue is relevant enough since it can only be triggered when files are created quickly after its parent directory is, but I noticed the issue is also present when copying a directory with files inside. The newly copied directory will always be listed as having 0 files. There were also cases where a directory indicated having files inside after all files inside had been deleted, but I was not yet able to recreate that so it may be a separate issue, though I suspect it to be related.

@joelim-work
Copy link
Collaborator Author

That does make sense, but perhaps we can trigger an update of the directory if a file inside the directory has changed, similar to the original approach with the updated variable I used here. When a file has been created inside a directory the directory could be marked as updated and refresh its content information after the copy operation has finished.

I don't think an updated flag is necessary since the directory is modified anyway when a new file is created. Anyway I changed the code to use this approach (i.e. actually reloading the directory via app.nav.renew after a small delay). It's kind of hacky but I don't know what else works if fsnotify can't watch directories recursively.

I do and frankly I would expect most users to use it since I would consider it a basic feature for a file manager.

It's very easy for a user to say this (especially if it their preference), but the fact remains that dircounts is an awkward feature to implement because each pane represents a directory, and any dircounts shown in that pane does not actually relate to that directory itself. This causes problems with the load approach from #776, and also with fsnotify because of they way it notifies about file creations. I'm not saying it's impossible to update dircounts from fsnotify, but it is very fiddly and requires some hacks to get working. I'm worried that if this patch becomes too complex then @gokcehan will decide it is not worth merging.

@valoq
Copy link
Contributor

valoq commented Apr 23, 2024

That does make sense, but perhaps we can trigger an update of the directory if a file inside the directory has changed, similar to the original approach with the updated variable I used here. When a file has been created inside a directory the directory could be marked as updated and refresh its content information after the copy operation has finished.

I don't think an updated flag is necessary since the directory is modified anyway when a new file is created. Anyway I changed the code to use this approach (i.e. actually reloading the directory via app.nav.renew after a small delay). It's kind of hacky but I don't know what else works if fsnotify can't watch directories recursively.

When copying directories with multiple files, the copied directory is now listed with a partial file count. Strangely its appears to be the same wrong number. Probably because of the timing?

If I copy the lf source code to lf2 (using cp) the original directory is listed with 44 files and the copy is always listed with 41 until I hit refresh manually.

I do and frankly I would expect most users to use it since I would consider it a basic feature for a file manager.

dircounts is an awkward feature to implement

It got this impression with a lot of the code and I am wondering if the only solid solution is to rewrite large parts of lf with a different approach to file updates. Not that I would find the time to do this either :)

So far this patch does not look that complex, especially when considering how many bugs it fixes.

@valoq
Copy link
Contributor

valoq commented Apr 23, 2024

Setting the refresh time to 100 milliseconds seems to solve the remaining issue and I was not able to recreate the problem with directories with different file numbers or file sizes.

@joelim-work
Copy link
Collaborator Author

When copying directories with multiple files, the copied directory is now listed with a partial file count. Strangely its appears to be the same wrong number. Probably because of the timing?

If I copy the lf source code to lf2 (using cp) the original directory is listed with 44 files and the copy is always listed with 41 until I hit refresh manually.

I tried copying the lf source code and got the correct count, so I'm quite sure it's a timing issue.

Setting the refresh time to 100 milliseconds seems to solve the remaining issue and I was not able to recreate the problem with directories with different file numbers or file sizes.

I can change the refresh times if you want. But there's two refresh times now, one for reloading directories, and another for updating a file within an existing directory. Which one did you want to change to 100 milliseconds?

It got this impression with a lot of the code and I am wondering if the only solid solution is to rewrite large parts of lf with a different approach to file updates. Not that I would find the time to do this either :)

So far this patch does not look that complex, especially when considering how many bugs it fixes.

Maybe the code looks a bit better now since I rewrote it a bit. But anyway when I mention the complexity, I'm not just talking about the patch itself but all the potential pitfalls involved when using fsnotify. For instance, the fact that there is no way to receive updates for files created in a newly created directory, so I have to resort to refreshing after some arbitrary delay (these kind of solutions feels somewhat hacky to me). As for rewriting large parts of lf to better handle file updates, I'm not what you have in mind here.

@valoq
Copy link
Contributor

valoq commented Apr 23, 2024

Setting the refresh time to 100 milliseconds seems to solve the remaining issue and I was not able to recreate the problem with directories with different file numbers or file sizes.

I can change the refresh times if you want. But there's two refresh times now, one for reloading directories, and another for updating a file within an existing directory. Which one did you want to change to 100 milliseconds?

Setting this line to 100 milliseconds solved the issue for me

Without this change, I also noticed the dircounts of copied directories were sometimes wrong and in some cases several directories inside the copied directories were not listed at all until a manual refresh.

Not sure why, but all those issues vanished when the timer in addUpdate is set to 100

@joelim-work
Copy link
Collaborator Author

It's pretty much the same issue as #1667 (comment). With a smaller delay, updates for the same file are sent more frequently, and an older update may overwrite a newer one.

I ended up changing both delays to 100 milliseconds, just to keep things consistent. If it doesn't work for you then I'll revert the change for the load delay.

@valoq
Copy link
Contributor

valoq commented Apr 24, 2024

After f9a51ad writing a file in the listed directory outside of lf causes previews to be redrawn resulting in flickering.

In some cases opening a terminal was enough to trigger the redraw, alternatively writing to any file in the filesystem using vim appears to do the same, even when the written file is on another partition entirely. This may be due to new temporary files being created by vim and sometimes by terminals.

@joelim-work
Copy link
Collaborator Author

joelim-work commented Apr 25, 2024

After f9a51ad writing a file in the listed directory outside of lf causes previews to be redrawn resulting in flickering.

In some cases opening a terminal was enough to trigger the redraw, alternatively writing to any file in the filesystem using vim appears to do the same, even when the written file is on another partition entirely. This may be due to new temporary files being created by vim and sometimes by terminals.

I can reproduce this flickering, but only for the following additional conditions:

  1. Files are added/removed from the current directory, and not other directories
  2. A previewer script is enabled
  3. The previewer script has an exit code of 1, which means the preview will always be redrawn even if the file hasn't changed (useful for images)

But this is a separate bug that already exists, I can reproduce with set period 1 as well. It looks like there was some attempt to reduce preview flickering in #546, but perhaps the implementation isn't complete and I don't intend to deal with it in this PR.

@joelim-work
Copy link
Collaborator Author

OK I ended up fixing this flicker in #1697

@valoq
Copy link
Contributor

valoq commented Apr 25, 2024

OK I ended up fixing this flicker in #1697

That addressed the problem, thank you.

@joelim-work
Copy link
Collaborator Author

joelim-work commented Apr 27, 2024

Hi @valoq, I pushed a new commit b4eb749 to experiment with using a separate dedicated goroutine to handle all the file/directory reloads (also to move as much code into watch.go as possible). This should hopefully fix the race condition issue resulting in partial file counts mentioned in #1667 (comment), and I also changed the delay time back to 10 milliseconds.

Let me know if it causes any issues, I might reverting the delay back to 100 milliseconds again, or end up reverting the entire commit.

@valoq
Copy link
Contributor

valoq commented Apr 27, 2024

That seems to fix the race condition as well, thanks.
I will try to do some more tests to trigger possible issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants