hot-reload not working with k3d filesystem on macOS (M1): "Bad file descriptor" #1476

Closed
yanchogeorgiev opened this issue Aug 8, 2022 · 8 comments · Fixed by #1964

@yanchogeorgiev

yanchogeorgiev commented Aug 8, 2022

Describe the bug
Hot-reload is not working on macOS (M1): the router is unable to start.
I tested the same setup on Ubuntu and it seems to work without any issue.

To Reproduce
Steps to reproduce the behavior:

  1. Install k3d and tilt
  2. Load configuration from this example: https://www.apollographql.com/docs/router/containerization/kubernetes/
  3. The router is unable to start

Expected behavior
The router should work as expected.

Output

{"timestamp":"2022-08-08T10:53:02.456187Z","level":"INFO","fields":{"message":"Apollo Router v0.14.0 // (c) Apollo Graph, Inc. // Licensed as ELv2 (https://go.apollo.dev/elv2)"},"target":"apollo_router::executable"}
thread 'main' panicked at 'Failed to initialise file watching.: Notify(Io(Os { code: 9, kind: Uncategorized, message: "Bad file descriptor" }))', apollo-router/src/files.rs:22:14
stack backtrace:
   0:       0x40023b8a8d - std::backtrace_rs::backtrace::libunwind::trace::h22893a5306c091b4
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
   1:       0x40023b8a8d - std::backtrace_rs::backtrace::trace_unsynchronized::h29c3bc6f9e91819d
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:       0x40023b8a8d - std::sys_common::backtrace::_print_fmt::he497d8a0ec903793
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys_common/backtrace.rs:66:5
   3:       0x40023b8a8d - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h9c2a9d2774d81873
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys_common/backtrace.rs:45:22
   4:       0x40023ddaac - core::fmt::write::hba4337c43d992f49
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/fmt/mod.rs:1194:17
   5:       0x40023b1b21 - std::io::Write::write_fmt::heb73de6e02cfabed
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/io/mod.rs:1655:15
   6:       0x40023ba8f5 - std::sys_common::backtrace::_print::h63c8b24acdd8e8ce
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys_common/backtrace.rs:48:5
   7:       0x40023ba8f5 - std::sys_common::backtrace::print::h426700d6240cdcc2
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys_common/backtrace.rs:35:9
   8:       0x40023ba8f5 - std::panicking::default_hook::{{closure}}::hc9a76eed0b18f82b
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:295:22
   9:       0x40023ba5a9 - std::panicking::default_hook::h2e88d02087fae196
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:314:9
  10:       0x40023bae42 - std::panicking::rust_panic_with_hook::habfdcc2e90f9fd4c
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:698:17
  11:       0x40023bad27 - std::panicking::begin_panic_handler::{{closure}}::he054b2a83a51d2cd
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:588:13
  12:       0x40023b8f44 - std::sys_common::backtrace::__rust_end_short_backtrace::ha48b94ab49b30915
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/sys_common/backtrace.rs:138:18
  13:       0x40023baa59 - rust_begin_unwind
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:584:5
  14:       0x40002bf163 - core::panicking::panic_fmt::h366d3a309ae17c94
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/panicking.rs:143:14
  15:       0x40002bf253 - core::result::unwrap_failed::hddd78f4658ac7d0f
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/result.rs:1785:5
  16:       0x40008f94e0 - apollo_router::files::watch::h97b97d82fdad84f1
  17:       0x400051eac6 - apollo_router::router::ApolloRouter::serve::h3c5f8c36263d6ab0
  18:       0x40004432c9 - <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll::hefc342c9ce0bf595
  19:       0x400082e1dd - std::thread::local::LocalKey<T>::with::h250e38da0c18df15
  20:       0x4000999fca - tokio::park::thread::CachedParkThread::block_on::h86bf9862cf0a7622
  21:       0x40006f6f57 - tokio::runtime::thread_pool::ThreadPool::block_on::hc56a09f1564bc964
  22:       0x40005eeadd - tokio::runtime::Runtime::block_on::hf00e84efa2076f11
  23:       0x4000344836 - apollo_router::executable::main::h53f2296129dbd07f
  24:       0x40002bfb0b - router::main::h479ae5fda6e8c43a
  25:       0x40002bfae3 - std::sys_common::backtrace::__rust_begin_short_backtrace::h28cdcd0d5b402e52
  26:       0x40002bfab9 - std::rt::lang_start::{{closure}}::h13104d833b4cbf97
  27:       0x40023ab3de - core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once::had4f69b3aefb47a8
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/ops/function.rs:259:13
  28:       0x40023ab3de - std::panicking::try::do_call::hf2ad5355fcafe775
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:492:40
  29:       0x40023ab3de - std::panicking::try::h0a63ac363423e61e
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:456:19
  30:       0x40023ab3de - std::panic::catch_unwind::h18088edcecb8693a
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panic.rs:137:14
  31:       0x40023ab3de - std::rt::lang_start_internal::{{closure}}::ha7dad166dc711761
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/rt.rs:128:48
  32:       0x40023ab3de - std::panicking::try::do_call::hda0c61bf3a57d6e6
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:492:40
  33:       0x40023ab3de - std::panicking::try::hbc940e68560040a9
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:456:19
  34:       0x40023ab3de - std::panic::catch_unwind::haed0df2aeb3fa368
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panic.rs:137:14
  35:       0x40023ab3de - std::rt::lang_start_internal::h9c06694362b5b80c
                               at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/rt.rs:128:20
  36:       0x40002bfbc2 - main
  37:       0x4005ac0d0a - __libc_start_main
  38:       0x40002bf9ee - _start
  39:                0x0 - <unknown>

Desktop (please complete the following information):

  • OS: macOS Monterey
  • Version 12.5

Additional context
Maybe this is related: notify-rs/notify#282

Additional question
Is there a way to manually reload the supergraph schema without restart?

@garypen
Contributor

garypen commented Aug 8, 2022

I'm going to rename the issue, because the problem is caused by the filesystem you are running in k3d. Reload works fine on OS X on bare metal or in Docker.

@garypen changed the title from `Unable to start router with hot-reload on macOS (M1): "Bad file descriptor"` to `hot-reload not working with k3d filesystem on macOS (M1): "Bad file descriptor"` Aug 8, 2022
@garypen
Contributor

garypen commented Aug 8, 2022

I've verified this. Not sure what the fix is right now.

@garypen
Contributor

garypen commented Aug 8, 2022

Workaround: don't specify --hot-reload; instead, use APOLLO_GRAPH_REF and APOLLO_KEY to download your supergraph from Studio.

@tcarrio

tcarrio commented Aug 26, 2022

I have confirmed this issue is also impacting me on an M1 Macbook using Docker Desktop.

@tcarrio

tcarrio commented Aug 26, 2022

Using the command:

docker run --rm -it \
    -p 8085:8085 \
    -v $(pwd)/schema.graphql:/dist/config/schema.graphql \
    -v $(pwd)/router.yaml:/dist/config/router.yaml \
    api-router

If I then update my Docker image to include the --hot-reload flag in the CMD I get the following error:

2022-08-26T17:40:31.857293Z ERROR apollo_router::executable: panicked at 'Failed to initialise file watching.: Notify(Io(Os { code: 9, kind: Uncategorized, message: "Bad file descriptor" }))', apollo-router/src/files.rs:22:14

@tcarrio

tcarrio commented Aug 26, 2022

Additionally, running the macOS binary directly is successful:

2022-08-26T17:44:40.595993Z  INFO apollo_router::axum_http_server_factory: GraphQL endpoint exposed at http://0.0.0.0:8085/api/graphql 🚀

@kmcrawford

This looks to be a known problem with the hotwatch library being used, a wrapper for the notify library. All recommendations say to switch to PollWatcher. Doing this switch might also fix issue #1695 for watching files it does not own.

https://docs.rs/notify/latest/notify/#known-problems
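
To illustrate why a polling watcher may sidestep the "Bad file descriptor" failure, here is a minimal sketch of the idea using only the Rust standard library. It is not the `notify` `PollWatcher` API and not the router's code; `Snapshot`, `snapshot`, `has_changed`, and `wait_for_change` are hypothetical names chosen for this example. Polling only needs ordinary metadata reads, so it does not depend on the platform's native notification backend being available inside the container.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::thread;
use std::time::{Duration, SystemTime};

/// File state compared between polls (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Snapshot {
    modified: SystemTime,
    len: u64,
}

/// Reads the modification time and size of `path` via plain metadata calls.
fn snapshot(path: &Path) -> io::Result<Snapshot> {
    let meta = fs::metadata(path)?;
    Ok(Snapshot {
        modified: meta.modified()?,
        len: meta.len(),
    })
}

/// Returns true when either the modification time or the size differs.
fn has_changed(before: &Snapshot, after: &Snapshot) -> bool {
    before != after
}

/// Polls `path` up to `max_polls` times, sleeping `interval` before each poll.
/// Returns Ok(true) as soon as a change is seen, Ok(false) if none was seen.
fn wait_for_change(path: &Path, interval: Duration, max_polls: u32) -> io::Result<bool> {
    let baseline = snapshot(path)?;
    for _ in 0..max_polls {
        thread::sleep(interval);
        if has_changed(&baseline, &snapshot(path)?) {
            return Ok(true);
        }
    }
    Ok(false)
}
```

A caller could, for example, run `wait_for_change(Path::new("/dist/config/schema.graphql"), Duration::from_secs(2), 30)` in a loop and trigger a reload whenever it returns `Ok(true)`. The trade-off is latency and repeated metadata reads in exchange for not needing a native watcher.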

garypen added a commit that referenced this issue Oct 19, 2022
fixes: #1476  
fixes: #1695

The change replaces `hotwatch` with `notify` and makes use of the
"recommended" watcher feature which attempts to pick a file watcher
which is best suited to the execution environment. I've also improved
the robustness of the test and the way in which events are conveyed from
the file watcher to the consumer (retries on full to prevent lost event
notifications).

I've tested this manually on my local system (macOS M1) and confirmed:
 - test suite works
 - docker with --hot-reload works
 - kubernetes in k3d works*    

* Note: [Limitations in kubernetes configmap
handling](https://kubernetes.io/docs/concepts/configuration/configmap/#mounted-configmaps-are-updated-automatically)
mean that router pods in kubernetes won't react to changes to configmaps
mounted to containers which specify a subPath (which the router helm
chart currently does). Changes to schema will trigger a hot-reload.

As a drive-by: I've made a slight modification to the helm chart which
removes the use of subPath. If we don't specify the subPath and mount
the chart at /app/ we get the same result but with the added benefit of
configMap updates by k8s.
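
The "retries on full" behaviour mentioned in the commit message above can be sketched with a bounded standard-library channel. This is an assumption-laden illustration, not the code from the referenced commit: `send_with_retry`, `max_retries`, and `backoff` are hypothetical names, and the real implementation may use a different channel type. The point is that a full queue causes a short wait and another attempt rather than a silently dropped event.

```rust
use std::sync::mpsc::{SyncSender, TrySendError};
use std::thread;
use std::time::Duration;

/// Tries to hand `event` to the consumer without blocking indefinitely.
/// If the bounded channel is full, sleeps for `backoff` and retries, up to
/// `max_retries` extra attempts. Returns Ok(number of attempts) on success,
/// or gives the event back as Err if the channel stayed full or the
/// receiver has gone away.
fn send_with_retry<T>(
    tx: &SyncSender<T>,
    mut event: T,
    max_retries: u32,
    backoff: Duration,
) -> Result<u32, T> {
    let mut attempts = 0;
    loop {
        attempts += 1;
        match tx.try_send(event) {
            Ok(()) => return Ok(attempts),
            // Queue is full and we still have retries left: wait, then retry.
            Err(TrySendError::Full(returned)) if attempts <= max_retries => {
                event = returned;
                thread::sleep(backoff);
            }
            // Retries exhausted or receiver dropped: return the event.
            Err(TrySendError::Full(returned)) | Err(TrySendError::Disconnected(returned)) => {
                return Err(returned)
            }
        }
    }
}
```

A file-watcher callback could call `send_with_retry(&tx, event, 5, Duration::from_millis(50))` and log when it gets `Err` back, so a slow consumer delays notifications instead of losing them.
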
@abernix abernix removed the triage label Nov 7, 2022
@boggsey

boggsey commented Nov 11, 2022

I think this is related but happy to open a new issue. On version 1.3.0 I'm unable to run the following command. APOLLO_ROUTER_HOT_RELOAD=true causes it to panic and APOLLO_ROUTER_HOT_RELOAD=false will allow it to run.

docker run -p 4000:4000 \
  --env APOLLO_GRAPH_REF="<your graph>" \
  --env APOLLO_KEY="<your key>" \
  --env APOLLO_ROUTER_HOT_RELOAD=true \
  --mount "type=bind,source=</PATH/TO>/router.yaml,target=/dist/config/router.yaml" \
  --rm \
  ghcr.io/apollographql/router:v1.3.0
