Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kqueue: wait without timeout #124

Closed
wants to merge 1 commit into from
Closed

Conversation

fjl
Copy link
Contributor

@fjl fjl commented Mar 4, 2016

As noted in #24, reading from the kqueue won't unblock when
it is closed on all platforms. This commit solves the issue by
additionally watching on a pipe which is closed when Watcher
is closed. This makes the read timeout unnecessary.

@fjl fjl force-pushed the kqueue-no-timeout branch 3 times, most recently from 2a1919d to baf2550 Compare March 4, 2016 16:35
@fjl
Copy link
Contributor Author

fjl commented Mar 4, 2016

I cannot reproduce the failure on Travis and it seems unrelated to the PR.

@fjl
Copy link
Contributor Author

fjl commented Mar 4, 2016

The tests pass on FreeBSD, too.

@nathany
Copy link
Contributor

nathany commented Mar 4, 2016

I'll re-run Travis to see if it's being flaky.

@fjl
Copy link
Contributor Author

fjl commented Mar 4, 2016

Looks like it worked this time.

@fjl
Copy link
Contributor Author

fjl commented Mar 5, 2016

I found a minor issue with how the pipe is set up and fixed it. Travis still passing.

@fjl
Copy link
Contributor Author

fjl commented Mar 11, 2016

Any ETA on getting this merged?

@nathany
Copy link
Contributor

nathany commented Mar 15, 2016

Sorry, still haven't reviewed and tested it yet.

@pquerna
Copy link

pquerna commented Apr 14, 2016

I was also seeing Close() hangs as described in #89 -- at $job we are now carrying this patch on our vendor'ed copy of fsnotify, would love to see it merged.

@nathany
Copy link
Contributor

nathany commented Apr 15, 2016

I'm not that familiar with syscall.Pipe. Hoping someone will review this.

As noted in fsnotify#24, reading from the kqueue won't unblock when
it is closed on all platforms. This commit solves the issue by
additionally watching on a pipe which is closed when Watcher
is closed. The read timeout is no longer necessary.
@fjl
Copy link
Contributor Author

fjl commented Apr 21, 2016

rebased

@fjl
Copy link
Contributor Author

fjl commented Nov 9, 2016

Is there anything I can do to get this merged?

@nathany
Copy link
Contributor

nathany commented Nov 10, 2016

Thanks for your patience @fjl.

There is currently a data race issue with kqueue on master, which should be remedied first.
#162 (comment)

As to this PR, we need some familiar with epoll to do the code review.

@nathany nathany mentioned this pull request Mar 14, 2017
@arp242
Copy link
Member

arp242 commented Jul 21, 2022

A lot of things changed since this PR was opened and it doesn't rebase cleanly with quite a few conflicts. It's not entirely clear to me if this problem has been solved since 2014 or if it still exists.

Either way, I think it would be good to have a test case; I ran go test -race -count=100 on FreeBSD (13) and that runs without problems, but maybe it only happens on macOS?

@nathany
Copy link
Contributor

nathany commented Jul 22, 2022

@arp242 GitHub Actions will run on macOS, if there is a useful test case, but not sure about running such a high count.

On macOS 12.4:

❯ go test -race -count=100
panic: test timed out after 10m0s

@arp242
Copy link
Member

arp242 commented Jul 22, 2022

Oh, you'll have to add -timeout=20m or some such; the timeout applies to the entire test run, not a single test, so if that takes too long it will time out.

GitHub Actions will run on macOS, if there is a useful test case, but not sure about running such a high count.

Yeah, just a quick low-effort way to see if intermittent problems show up; if they do, a dedicated (non-slow) test case can be added.

@nathany
Copy link
Contributor

nathany commented Jul 22, 2022

This is the result running another time on macOS 12.5:

❯ go test -race -count=100 -timeout=20m
--- FAIL: TestFsnotifyDeleteWatchedDir (0.52s)
    integration_test.go:460: error received: fdopendir /var/folders/82/9wkthct972xb6h3nwsx2rfnh0000gn/T/fsnotify1522497754: no such file or directory
    integration_test.go:471: event received: "/var/folders/82/9wkthct972xb6h3nwsx2rfnh0000gn/T/fsnotify1522497754/TestFsnotifyEventsExisting.testfile": REMOVE
    integration_test.go:471: event received: "/var/folders/82/9wkthct972xb6h3nwsx2rfnh0000gn/T/fsnotify1522497754": REMOVE
FAIL
exit status 1
FAIL	github.com/fsnotify/fsnotify	886.645s

@arp242
Copy link
Member

arp242 commented Jul 22, 2022

Coolio, I can't reproduce that on FreeBSD (just tried again to be sure), but it confirms that the problem still exists, and that it's specific to macOS and not all of kqueue (probably).

So need to rebase and write a decent test case which doesn't take 15 minutes, probably something similar to this.

@arp242
Copy link
Member

arp242 commented Jul 22, 2022

#262 seems to do the same btw, might want to go with that one as it seems a bit simpler.

arp242 added a commit that referenced this pull request Aug 1, 2022
The timeout for unix.Kevent() is causing issues; every 100ms it will do a new
unix.Kevent() syscall, which isn't too efficient: even if you have just one
change an hour, you will still keep calling kevent() ten times per second,
resulting in a needlessly high CPU usage.

Without a timeout, kevent() will block indefinitely until there are some events,
which is much more efficient. We can't just remove the timout however, since we
can't interrupt the kevent() call on FreeBSD and NetBSD, and it will hang
forever. This issue is described in more detail here:
#262 (comment)

To solve this we register a new kevent() with the file descriptor set to the
closepipe; when this pipe is closed an event is sent and kevent() will stop
blocking.

This is a rebased version of #124.

Fixes #89
Fixes #237
Fixes #333
Supersedes and closes #124
Supersedes and closes #262
Supersedes and closes #334
@arp242
Copy link
Member

arp242 commented Aug 1, 2022

New version of this PR: #480

I couldn't update this PR for some reason; I guess the branch/fork disappeared after all these years? Anyway, it's essentially the same patch, but rebased on the latest main and with some minor cleanups.

@arp242 arp242 closed this Aug 1, 2022
arp242 added a commit that referenced this pull request Aug 6, 2022
The timeout for unix.Kevent() is causing issues; every 100ms it will do a new
unix.Kevent() syscall, which isn't too efficient: even if you have just one
change an hour, you will still keep calling kevent() ten times per second,
resulting in a needlessly high CPU usage.

Without a timeout, kevent() will block indefinitely until there are some events,
which is much more efficient. We can't just remove the timout however, since we
can't interrupt the kevent() call on FreeBSD and NetBSD, and it will hang
forever. This issue is described in more detail here:
#262 (comment)

To solve this we register a new kevent() with the file descriptor set to the
closepipe; when this pipe is closed an event is sent and kevent() will stop
blocking.

This is a rebased version of #124.

Fixes #89
Fixes #237
Fixes #333
Supersedes and closes #124
Supersedes and closes #262
Supersedes and closes #334

Co-authored-by: Felix Lange <fjl@twurst.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants