Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Zeitwerk throws a LoadError of cannot load such file for turbo-rails-1.0.0/app/channels/turbo/streams when RAILS_ENV=production is set while using Sidekiq #285

Closed
nickjj opened this issue Dec 18, 2021 · 28 comments
Assignees

Comments

@nickjj
Copy link

nickjj commented Dec 18, 2021

Hi,

Edit: The scope of this issue has changed to affect every Active Job running through Sidekiq and it also affects all environments. The first reply to this issue has more details. I left the original issue here for extra context which is still worth reading.

I recently updated an example Rails app to use Rails 7 (from Rails 6). After doing that Sidekiq will no longer start successfully when RAILS_ENV=production is set but it starts successfully with RAILS_ENV=development.

This gem comes up as a potential culprit but it's too soon to say if it's a bug with this gem. I haven't seen anything similar to this type of error previously (I wasn't using zeitwerk with Rails 6.x).

Here's the turbo-rails related stack trace:

worker_1    | 2021-12-18T15:43:09.829Z pid=1 tid=2ut INFO: Booting Sidekiq 6.3.1 with redis options {:url=>"redis://redis:6379/1"}
worker_1    | /usr/local/bundle/gems/zeitwerk-2.5.1/lib/zeitwerk/loader/helpers.rb:95:in `require': cannot load such file -- /usr/local/bundle/gems/turbo-rails-1.0.0/app/channels/turbo/streams (LoadError)
worker_1    |   from /usr/local/bundle/gems/zeitwerk-2.5.1/lib/zeitwerk/loader/helpers.rb:95:in `const_get'
worker_1    |   from /usr/local/bundle/gems/zeitwerk-2.5.1/lib/zeitwerk/loader/helpers.rb:95:in `cget'
worker_1    |   from /usr/local/bundle/gems/zeitwerk-2.5.1/lib/zeitwerk/loader.rb:239:in `block (2 levels) in eager_load'
worker_1    |   from /usr/local/bundle/gems/zeitwerk-2.5.1/lib/zeitwerk/loader/helpers.rb:26:in `block in ls'
worker_1    |   from /usr/local/bundle/gems/zeitwerk-2.5.1/lib/zeitwerk/loader/helpers.rb:18:in `each_child'
worker_1    |   from /usr/local/bundle/gems/zeitwerk-2.5.1/lib/zeitwerk/loader/helpers.rb:18:in `ls'
worker_1    |   from /usr/local/bundle/gems/zeitwerk-2.5.1/lib/zeitwerk/loader.rb:227:in `block in eager_load'
worker_1    |   from /usr/local/bundle/gems/zeitwerk-2.5.1/lib/zeitwerk/loader.rb:212:in `synchronize'
worker_1    |   from /usr/local/bundle/gems/zeitwerk-2.5.1/lib/zeitwerk/loader.rb:212:in `eager_load'
worker_1    |   from /usr/local/bundle/gems/zeitwerk-2.5.1/lib/zeitwerk/loader.rb:312:in `each'
worker_1    |   from /usr/local/bundle/gems/zeitwerk-2.5.1/lib/zeitwerk/loader.rb:312:in `eager_load_all'
worker_1    |   from /usr/local/bundle/gems/railties-7.0.0/lib/rails/application/finisher.rb:79:in `block in <module:Finisher>'
worker_1    |   from /usr/local/bundle/gems/railties-7.0.0/lib/rails/initializable.rb:32:in `instance_exec'
worker_1    |   from /usr/local/bundle/gems/railties-7.0.0/lib/rails/initializable.rb:32:in `run'
worker_1    |   from /usr/local/bundle/gems/railties-7.0.0/lib/rails/initializable.rb:61:in `block in run_initializers'
worker_1    |   from /usr/local/lib/ruby/3.0.0/tsort.rb:228:in `block in tsort_each'
worker_1    |   from /usr/local/lib/ruby/3.0.0/tsort.rb:350:in `block (2 levels) in each_strongly_connected_component'
worker_1    |   from /usr/local/lib/ruby/3.0.0/tsort.rb:431:in `each_strongly_connected_component_from'
worker_1    |   from /usr/local/lib/ruby/3.0.0/tsort.rb:349:in `block in each_strongly_connected_component'
worker_1    |   from /usr/local/lib/ruby/3.0.0/tsort.rb:347:in `each'
worker_1    |   from /usr/local/lib/ruby/3.0.0/tsort.rb:347:in `call'
worker_1    |   from /usr/local/lib/ruby/3.0.0/tsort.rb:347:in `each_strongly_connected_component'
worker_1    |   from /usr/local/lib/ruby/3.0.0/tsort.rb:226:in `tsort_each'
worker_1    |   from /usr/local/lib/ruby/3.0.0/tsort.rb:205:in `tsort_each'
worker_1    |   from /usr/local/bundle/gems/railties-7.0.0/lib/rails/initializable.rb:60:in `run_initializers'
worker_1    |   from /usr/local/bundle/gems/railties-7.0.0/lib/rails/application.rb:369:in `initialize!'
worker_1    |   from /app/config/environment.rb:5:in `<top (required)>'
worker_1    |   from <internal:/usr/local/lib/ruby/3.0.0/rubygems/core_ext/kernel_require.rb>:148:in `require'
worker_1    |   from <internal:/usr/local/lib/ruby/3.0.0/rubygems/core_ext/kernel_require.rb>:148:in `require'
worker_1    |   from /usr/local/bundle/gems/zeitwerk-2.5.1/lib/zeitwerk/kernel.rb:35:in `require'
worker_1    |   from /usr/local/bundle/gems/sidekiq-6.3.1/lib/sidekiq/cli.rb:273:in `boot_application'
worker_1    |   from /usr/local/bundle/gems/sidekiq-6.3.1/lib/sidekiq/cli.rb:37:in `run'
worker_1    |   from /usr/local/bundle/gems/sidekiq-6.3.1/bin/sidekiq:31:in `<top (required)>'
worker_1    |   from /usr/local/bundle/bin/sidekiq:25:in `load'
worker_1    |   from /usr/local/bundle/bin/sidekiq:25:in `<main>'
hellorails_worker_1 exited with code 1

The application is running within Docker and it's open source at https://github.com/nickjj/docker-rails-example which should provide a reproduceable test case based on the latest main branch. There's a quick getting started guide for the app at https://github.com/nickjj/docker-rails-example#running-this-app which ultimately comes down to running a few commands.

If I exec into the worker container I'm able to stat /usr/local/bundle/gems/turbo-rails-1.0.0/app/channels/turbo/streams which returns back that the directory does exist as seen below:

ruby@97b98bc1dd6b:/app$ stat /usr/local/bundle/gems/turbo-rails-1.0.0/app/channels/turbo/streams
  File: /usr/local/bundle/gems/turbo-rails-1.0.0/app/channels/turbo/streams
  Size: 4096            Blocks: 8          IO Block: 4096   directory
Device: bdh/189d        Inode: 290588      Links: 2
Access: (0755/drwxr-xr-x)  Uid: ( 1000/    ruby)   Gid: ( 1000/    ruby)
Access: 2021-12-18 15:26:58.040000000 +0000
Modify: 2021-12-18 15:26:29.950000000 +0000
Change: 2021-12-18 15:26:55.900000000 +0000
 Birth: 2021-12-18 15:26:55.900000000 +0000

What's weird is the error mentions cannot load such file but that path it not a file, it's a directory on disk.

What's also interesting is the web, worker and cable containers that are a part of the project all use an identical Dockerfile and the only difference is the command that gets run (such as puma or sidekiq). Both the web and cable containers come up fine. It's only worker that fails.

Running the same stat command in the web and cable containers produce the same output.

I'm tracking the issue in my repo at nickjj/docker-rails-example#30.

I'm not sure where else to post this. I'm also not sure how autoloading works under the hood. If paths are loaded alphabetically I'm taking a guess here that other gem paths were successfully loaded and it failed when it got to turbo-rails?

@nickjj
Copy link
Author

nickjj commented Dec 18, 2021

The scope of this issue has changed since its creation.

It's now an issue with RAILS_ENV=development too but only when a job is about to execute through Active Job. Executing a job through Active Job with a :sidekiq back-end will throw this error:

worker_1 | 2021-12-18T18:54:20.325Z pid=1 tid=d35 WARN: LoadError: cannot load such file -- /usr/local/bundle/gems/turbo-rails-1.0.0/app/channels/turbo/streams

Perhaps this has to do with lazy loading modules in development vs pre-loading modules in production? At least it happens in both environments now. Question is whether this is a turbo-rails, sidekiq, Rails 7 or a user issue.

From my end all I did was execute a default Rails 7 mailer through Active Job using deliver_later. It's also worth pointing out if I create a new Sidekiq job in app/workers and execute the job directly with Sidekiq without Active Job then the job runs successfully, so now things are maybe scoped tighter to being related to Active Job.

@gryphon
Copy link

gryphon commented Dec 20, 2021

Having the same issue without zeitwerk

@fxn
Copy link
Contributor

fxn commented Dec 27, 2021

I'll have a look, do you have a minimal app where it can be reproduced?

Let me shed some light about why is a directory being required. You know that if a module acting as a namespace does not have a file defining it, Zeitwerk autovivifies it, like classic did.

In order to do that when the module is first referenced by the user, Zeitwerk defines an autoload on the first directory it finds implicitly defining the module. Then, a thin require wrapper intercepts those ones, autovivifies, executes callbacks, etc.

This is telling us that the autoload is being triggered, but for some reason the internal library registry does not identify it as being managed by the loader. I have never seen this before, we need to understand.

Let me also say that since Zeitwerk has that require decorator installed, it always appears in backtraces. As said before, we need to dig into it to know what's going on, and where.

@nickjj
Copy link
Author

nickjj commented Dec 27, 2021

I'll have a look, do you have a minimal app where it can be reproduced?

The example app at https://github.com/nickjj/docker-rails-example can be used to reproduce it.

As long as you already have Docker installed you can run these steps to get up and running:

git clone https://github.com/nickjj/docker-rails-example hellorails
cd hellorails
cp .env.example .env
cp docker-compose.override.yml.example docker-compose.override.yml
docker-compose build

To reproduce the error you can goto config/environments/development.rb and change config.eager_load = false to config.eager_load = true.

Once that's been adjusted you can run docker-compose up to start the project. You will be greeted with a similar stack trace on the worker container that was posted in my original issue. You can also run any Rails command you would normally do by prefixing them with ./run, for example ./run rails db:setup. The above config change is enough to reproduce it but you could generate a Rails mailer and run that with deliver_later to also reproduce it without eager loading as soon as the mail gets sent.

You can also check the main app (in case you want to create a mailer that gets sent in a controller action) by visiting http://localhost:8000.

This may look like a lot of steps to reproduce it but without Docker you would be responsible for setting up Ruby, Redis, Sidekiq, Foreman, etc. If you already have that available then the same config change should do the trick on your end.

If you have any additional questions please let me know. Happy to include as much information as you request.

Edit: It's worth pointing out this issue persists with Zeitwerk 2.5.1 and 2.5.2.

@Yasser
Copy link

Yasser commented Jan 1, 2022

I can confirm the same error. I'm not sure if it helps, but I've noticed that most of my Sidekiq jobs complete without issue. The ones that fail seem to all be Mailer jobs.

@rwrrll
Copy link

rwrrll commented Jan 3, 2022

I had this exact problem, and resolved it by using bundle exec sidekiq (instead of just sidekiq) to start that process. It looks like you're using the latter form as your Docker Compose command. HTH!

@Yasser
Copy link

Yasser commented Jan 3, 2022

I had this exact problem, and resolved it by using bundle exec sidekiq (instead of just sidekiq) to start that process. It looks like you're using the latter form as your Docker Compose command. HTH!

This appears to have resolved the issue on my development machine.

nickjj added a commit to nickjj/docker-rails-example that referenced this issue Jan 3, 2022
This was reported as a potential fixed based on:
  hotwired/turbo-rails#285 (comment)

The root cause isn't known at the time of making this commit but this
will at least let us run Active Jobs without errors for now.
@nickjj
Copy link
Author

nickjj commented Jan 4, 2022

Nice find @skinnyfit, I can also confirm that using bundle exec sidekiq is working.

I've patched the example Docker Rails app for this at nickjj/docker-rails-example@ed7e99c.

It still feels like maybe the root cause isn't addressed tho since this wasn't an issue previously? I'll keep this issue open until we're greeted with an official answer or it gets closed from a maintainer of turbo-rails.

@arbythree
Copy link

Confirming @skinnyfit solution solves in production. Thank you so much!

@fxn
Copy link
Contributor

fxn commented Jan 5, 2022

Hey, I see the missing bundle exec fixes the problem, and that is how the command should be run. However, I'll try to understand why it didn't work with the original command, just for the sake of it.

In my machine, with a just cloned repo, the error is different:

cannot load such file -- /app/app/channels/application_cable

This is telling us, at least, that the issue is not related to turbo-rails itself. We can leave the issue here by now to keep it simple, however.

I'll write back when I get something more.

@fxn
Copy link
Contributor

fxn commented Jan 5, 2022

OK, I am closer to the root cause, and I have been able to reproduce with a minimal Ruby script that does not depend on Rails.

The command sidekiq loads rails here. That happens before the application boots, and therefore before Bundler is setup in config/boot.rb.

With that loading order, the Kernel#require wrapper installed by Zeitwerk is no longer invoked. Ever. We saw it with this error, but it is not invoked for autoloaded files either. It is either removed by Bundler or totally hijacked.

Why wasn't this happening before? Because in Rails 7 rails.rb loads zeitwerk, since it defines the method Rails.autoloaders. In Rails 6.x loading was artificially delayed as much as possible because if you opted for classic, absolutely everything should be like in Rails 5.

So, the problem is loading zeitwerk before you setup Bundler, and this can be seen in this script:

require 'fileutils'
require 'tmpdir'

# UNCOMMENT THE FOLLOWING LINE TO REPRODUCE.
# require 'zeitwerk'

require 'bundler/inline'
gemfile do
  source 'https://rubygems.org'
  gem 'zeitwerk'
end

Dir.mktmpdir do |dir|
  Dir.chdir(dir)

  FileUtils.mkdir_p('lib/admin')
  File.write('lib/foo.rb', 'Foo = 1')

  loader = Zeitwerk::Loader.new
  loader.push_dir('lib')
  loader.log!
  loader.setup

  p Foo
  p Admin
end

If you run the script as is, you'll see:

  1. Foo is loaded and you see the loader activity logged.
  2. Admin is loaded and you see the loader activity logged.

If you uncomment the require in line 5 (therefore, installing Zeitwerk's require decorator before Bundler enters the game):

  1. Foo is loaded and you DO NOT see the loader activity logged.
  2. Admin is not loaded.

Both (1) and (2) are telling us the decorator is not invoked.

In the case of (1), we see it because the traces are missing. If the application had callbacks, for example, they would not be called.

(2) is what this issue detected, but it could have been (1) as well.

Next step now is to understand who hijacks the require wrapper and why.

@fxn
Copy link
Contributor

fxn commented Jan 5, 2022

Bundler does it here.

Basically, Bundler assumes no one else is decorating require. That does not seem reasonable, Zeitwerk does, Bootsnap does, and this is Ruby anyway.

Bundler is special, however, maybe this needs to be addressed with documentation. I'll open an issue in Bundler to discuss. But I think we can close here: You have to use bundle exec sidekiq because the project is managed by Bundler.

@nickjj
Copy link
Author

nickjj commented Jan 5, 2022

Thank you for the excellent breakdown, thorough explanation and coming up with a good repeatable test case.

I'll close this issue based on everything you wrote.

@nickjj nickjj closed this as completed Jan 5, 2022
@fxn
Copy link
Contributor

fxn commented Jan 5, 2022

Followup.

As discussed, in a project managed by Bundler you need to bundle exec executables, since the operating system could have a different Sidekiq, or a different Rails, than the one in the Gemfile (the require "rails" in Sidekiq could load a different Rails version than the one present in the Gemfile, you know).

However, being pragmatic, I have patched Rails (rails/rails@9b61af8) to delay loading Zeitwerk just a tad, so that a mere require "rails" does not load the gem. You need to instantiate an application. It's all internal, public interface stays the same. That will ship with 7.0.1.

@fxn
Copy link
Contributor

fxn commented Jan 6, 2022

Issue opened rubygems/rubygems#5263 to at least have a conversation about it with the Bundler team.

It's a very edge incompatibility, you need to load Zeitwerk first and then Bundler for it to happen. That's rare, because a project managed by Bundler typically runs executables and loads libraries managed by Bundler. In particular, Bundler loads before Zeitwerk.

@coorasse
Copy link

coorasse commented Apr 21, 2022

I encountered the same issue while working on my own gem.
I switched to zeitwerk and added

loader = Zeitwerk::Loader.for_gem
loader.setup

as usual. But when testing it in my own Rails app, with great surprise I got the error above although I was using bundle exec. I also discovered that the issue happened only when using

gem 'rails_api_logger', path: '../rails_api_logger'

I pushed my branch and changed the Gemfile to

gem 'rails_api_logger', github: 'renuo/rails_api_logger'

and the problem disappeared. Of course, this is anyway very inconvenient, because I cannot point to the local path of my gem, but it might help tackle the issue.

@fxn
Copy link
Contributor

fxn commented Apr 21, 2022

@coorasse which Rails version?

@coorasse
Copy link

Rails version: 7.0.2.3
Bundler: 2.3.4
Ruby 3.1.0

@fxn
Copy link
Contributor

fxn commented Apr 21, 2022

@coorasse Awesome. Would it be possible to share a minimal application that reproduces the issue?

@coorasse
Copy link

Here you are: https://github.com/coorasse/reproduce_zeitwerk_issue
Not very minimal 😅 but it triggers the problem.
I created a Rails 7 app, and just copy-pasted the gem source code inside it.
You can simply run bundle exec rake and get a green test.
If you uncomment the gem in Gemfile, point to the local folder, and run bundle exec rake you'll see the error.

@fxn
Copy link
Contributor

fxn commented Apr 24, 2022

Hey, the branch new_logger does not exist.

If I point to master, and have master locally, bundle exec rake passes both ways.

@coorasse
Copy link

Right, that branch got deleted today. I now pushed a new version where it points to the local path directly and gives me the error when running bundle exec rake:

Error:
TestControllerTest#test_the_truth:
LoadError: cannot load such file -- /Users/coorasse/.rbenv/versions/3.1.0/lib/ruby/gems/3.1.0/gems/turbo-rails-1.0.1/app/channels/turbo/streams
    /Users/coorasse/.rbenv/versions/3.1.0/lib/ruby/gems/3.1.0/gems/bootsnap-1.11.1/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:30:in `require'

@fxn
Copy link
Contributor

fxn commented Apr 24, 2022

Explanation

OK, I see what happens.

Bundler resets Kernel#require the hard way, without taking into consideration that other gems (like Zeitwerk, or Bootsnap) could have their own decorations. Because of this, if you load Zeitwerk first, and then Bundler, or you load it early enough, Zeitwerk decoration is erased by Bundler. This has been reported in rubygems/rubygems#5263.

Why does the local dependency incur in this situation? Because Bundler evaluates its gemspec file. I guess it does so to figure out version and dependencies. The gemspec loads the gem entry point, the entry point loads Zeitwerk, and later Bundler resets Kernel#require.

Workaround

A workaround would be to set the version in its own file and load just the version file in the gemspec, which is a common practice anyway:

# rails_api_logger.gemspec
require_relative "lib/rails_api_logger/version"

Extra Ball

Let me also suggest to ignore the generators directory:

loader.ignore("{__dir__}/generators")

since its files are not meant to be autoloaded and do not follow conventions anyway.

@coorasse
Copy link

Thanks! This makes totally sense!
The project I am working on was running zeitwerk:test and we actually had it fixed already probably, because the latest Gemspec does not require version anymore and we also excluded the generators for the same reason.

I wonder if:

  1. there would be a way to display a warning on the console, because the error is completely unrelated and makes it very hard to tackle down.
  2. should the generators folder always be excluded for gems?

@fxn
Copy link
Contributor

fxn commented Apr 25, 2022

Hey!

we also excluded the generators for the same reason

Excluding from eager loading is not enough. You have to ignore the directory.

If you exclude, you are still telling the loader that it is managing the directory, that the directory is a namespace, and that it should define a constant Generators. The only detail is that you don't want to have it eager loaded. If you inspect Object.constants you'll see Generators there, and it should not be because that directory does not represent a namespace.

On the other hand, ignoring means: you are not managing this at all, ignore it for all purposes. This is what you want.

there would be a way to display a warning on the console, because the error is completely unrelated and makes it very hard to tackle down.

It is unfortunate and difficult to understand, yes. Very rare to hit it also, though. I can't think of a reasonable way to warn that does not imply monkey-patching Bundler, which I think would be a bad idea. Someone else is removing my methods! What we really need is this to be fixed upstream.

should the generators folder always be excluded for gems?

This is a good question.

Take into account that Zeitwerk knows nothing about Rails, generators is not special. All you got is a generic gem. Look at your gem, it is not even defining the conventional namespace. What can you assume? Little.

However, it could be the case that extra directories in lib are not mean to be namespaces most of the time in gems. That would justify documentation/warnings. But then you need interface that fits well to be able to silence the warning if the directory is legit.

I'll think about it.

@dhh dhh assigned fxn Apr 30, 2022
@fxn
Copy link
Contributor

fxn commented May 1, 2022

I have inspected a sample of gems that load with Zeitwerk:

  1. The vast majority of them instantiate their loader with for_gem.
  2. Some have extra Ruby files or directories under lib. Examples: generators, tasks, or core_extensions.
  3. From those, some properly ignore them, but some overlooked doing this.

This, and our thread, tell me that checking for these extras is worthwhile. I have implemented this feature in fxn/zeitwerk@f1c1b59. You get a warning if there are extra non-ignored Ruby files or directories under lib, and the warning itself tells you the options you have to address it.

This issue was already closed, and after this it is double closed 😀.

@fxn
Copy link
Contributor

fxn commented May 1, 2022

@coorasse I have reported your use case in rubygems/rubygems#5263 (comment).

Researching this, I realized that when Bundler evaluates the local gemspec of a path dependency, things are not ready yet. So, any gem loaded as a side-effect of that execution is loaded from the system, and that might not match the one in the Gemfile.lock. This is telling us that cherry-picking a vanilla version file in the gemspec is not only idiomatic, but also the way to play well with how Bundler works. (You hard-coded the string, also aligned with this.)

If the gem loaded from the system has a different version than the one in Gemfile.lock, at some point you'll get an activation conflict and bundle exec will abort. Otherwise, things work by pure chance.

@fxn
Copy link
Contributor

fxn commented Jun 17, 2022

You get a warning if there are extra non-ignored Ruby files or directories under lib

This has shipped with 2.6.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

7 participants