Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Zeitwerk broken by Bundler #104

Closed
Aquaj opened this issue Jan 7, 2020 · 9 comments
Closed

Zeitwerk broken by Bundler #104

Aquaj opened this issue Jan 7, 2020 · 9 comments

Comments

@Aquaj
Copy link

Aquaj commented Jan 7, 2020

Hi, I think Zeitwerk's gets messed up by Bundler's reverse_rubygems_kernel_mixin internal method.

I'll do my best to explain the initial issue and what I've found so far.
On the app I'm working on we had an issue where Karafka wouldn't fail on launch (launched with bundle exec foreman start karafka), raising a LoadError - file not found 'app/channels/application_cable'.
As the path is a folder and not an actual ruby file, it seems like it should've been autovivified instead of being required by the app. So I went digging some more and it appears that even though the require method defined in lib/zeitwerk/kernel.rb is called at the beginning of the application initialization, as soon as the app reaches config/application.rb (more specifically Bundler.require(*Rails.group)), Bundler tries to setup, which includes a call to reverse_rubygems_kernel_mixin, resetting Kernel#require to its original version and overwriting Zeitwerk's modifications to it.

I don't really know how to fix it, and I don't know why we only encounter it when launching our karafka process and not on the rails server one, but I'm open to ideas on how we could solve this.

For now we had to resort to defaulting back on the :classic loader, which I'm not a big fan of but does the work.

I also have opened an issue on Bundler's repository (bundler/bundler#7553).


UPDATE

The issue was ultimately fixed in Bundler. So if you have the issue, try to update it.

If you can't update, here are some workarounds we'd found before:

  • Falling back to the :classic autoloader (config.autoloader = :classic in config/application.rb)
  • Using a binstub (bundle binstub <library>) in the Procfile for the library instead of the bundle exec call (service: bin/<library> instead of service: bundle exec <library>)
@fxn
Copy link
Owner

fxn commented Jan 7, 2020

That is strange, Zeitwerk routinely works with Bundler in Rails applications and non-Rails Ruby projects.

Would it be possible for you to provide a minimal application that reproduces this?

@Aquaj
Copy link
Author

Aquaj commented Jan 7, 2020

I found it strange too.
As mentioned it's not happening with the server process, which part of what's making this so confusing.

I'll try to set up a minimal app replicating the issue during the week.

@Aquaj
Copy link
Author

Aquaj commented Jan 9, 2020

I set up a replicating app with as little as I could (by shaving away from my problem-encountering app). https://github.com/Aquaj/broken-zeitwerk-minimal-ex

As I worked on it I figured a few more things:

  • It happens (apparently) because of a require "config/environment" in Karafka's boot file
  • Without foreman it doesn't happen
  • With a karafka binstub (instead of bundle exec) it doesn't happen
  • Without Karafka (having foreman run bundle exec ruby karafka.rb) it doesn't happen

It seems to stem from a cocktail of libraries (Karafka, Foreman, Zeitwerk, Bundler) so it's likely a fix could be found in any of them, but since retrograding to the :classic autoloader fixes the issue, it still would be a regression for Zeitwerk.

I'll try to look into the issue more now that we have a minimal replicating case. There's no urgency to finding a fix though, as there are many ways to go around the issue.

@fxn
Copy link
Owner

fxn commented Jan 9, 2020

Would it be possible for you to inspect

Object.method(:require).source_location

in a few places to see it changed?

For example, in a vanilla shell you get this:

% irb
irb(main):001:0> Object.method(:require).source_location
=> ["/Users/fxn/.asdf/installs/ruby/2.6.5/lib/ruby/2.6.0/rubygems/core_ext/kernel_require.rb", 34]
irb(main):002:0> require "zeitwerk"
=> true
irb(main):003:0> Object.method(:require).source_location
=> ["/Users/fxn/.asdf/installs/ruby/2.6.5/lib/ruby/gems/2.6.0/gems/zeitwerk-2.2.2/lib/zeitwerk/kernel.rb", 13]

@Aquaj
Copy link
Author

Aquaj commented Jan 9, 2020

Sure !
I added the logging wherever I thought it might be useful. Then some more around the require-modifying places inside gems (ActiveSupport, Bootsnap).

I also added this snippet at the very beginning of the foreman binstub:

module Kernel
  def self.method_added(method_name)
    return unless method_name.to_s == 'require'
    puts "== REDEFINITION OF REQUIRE in Kernel"
    caller.first(3).each { |callsite| puts ' - ' + callsite }
    puts "=="
  end
end

And here are the results

=[BINSTUB]= ["/Users/aquaj/.rbenv/versions/2.6.5/lib/ruby/site_ruby/2.6.0/rubygems/core_ext/kernel_require.rb", 34]
== REDEFINITION OF REQUIRE in Kernel
 - /Users/aquaj/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/bundler-2.0.2/lib/bundler/rubygems_integration.rb:605:in `define_method'
 - /Users/aquaj/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/bundler-2.0.2/lib/bundler/rubygems_integration.rb:605:in `redefine_method'
 - /Users/aquaj/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/bundler-2.0.2/lib/bundler/rubygems_integration.rb:355:in `block in reverse_rubygems_kernel_mixin'
==
karafka.1 | started with pid 5721
karafka.1 | =[ZEITWERK/PRE-MODULE]= nil
karafka.1 | =[ZEITWERK/POST-MODULE]= ["/Users/aquaj/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/zeitwerk-2.2.1/lib/zeitwerk/kernel.rb", 14]
karafka.1 | =[KARAFKA]= ["/Users/aquaj/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/zeitwerk-2.2.1/lib/zeitwerk/kernel.rb", 14]
karafka.1 | =[ENVIRONMENT/PRE-REQ_REL]= ["/Users/aquaj/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/zeitwerk-2.2.1/lib/zeitwerk/kernel.rb", 14]
karafka.1 | =[APPLICATION/PRE-REQ_REL]= ["/Users/aquaj/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/zeitwerk-2.2.1/lib/zeitwerk/kernel.rb", 14]
karafka.1 | =[BOOT/PRE-REQ_BUNDLER]= ["/Users/aquaj/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/zeitwerk-2.2.1/lib/zeitwerk/kernel.rb", 14]
karafka.1 | =[BOOT/POST-REQ_BUNDLER]= ["/Users/aquaj/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/zeitwerk-2.2.1/lib/zeitwerk/kernel.rb", 14]
karafka.1 | =[BOOTSNAP/PRE-MODULE]= ["/Users/aquaj/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/zeitwerk-2.2.1/lib/zeitwerk/kernel.rb", 14]
karafka.1 | =[BOOTSNAP/POST-MODULE]= ["/Users/aquaj/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/bootsnap-1.4.5/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb", 27]
karafka.1 | =[ACTIVESUPPORT/PRE-LOADABLE]= ["/Users/aquaj/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/bootsnap-1.4.5/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb", 27]
karafka.1 | =[ACTIVESUPPORT/POST-LOADABLE]= ["/Users/aquaj/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/activesupport-6.0.0/lib/active_support/dependencies.rb", 323]
karafka.1 | =[BOOT/POST-REQ_BOOTSNAP]= ["/Users/aquaj/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/activesupport-6.0.0/lib/active_support/dependencies.rb", 323]
karafka.1 | =[APPLICATION/PRE-REQ]= ["/Users/aquaj/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/activesupport-6.0.0/lib/active_support/dependencies.rb", 323]
karafka.1 | =[APPLICATION/POST-REQ]= ["/Users/aquaj/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/activesupport-6.0.0/lib/active_support/dependencies.rb", 323]
karafka.1 | =[ENVIRONMENT/POST-REQ_REL]= ["/Users/aquaj/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/activesupport-6.0.0/lib/active_support/dependencies.rb", 323]
karafka.1 | bundler: failed to load command: karafka (/Users/aquaj/.rbenv/versions/2.6.5/bin/karafka)
karafka.1 | LoadError: cannot load such file -- /Users/aquaj/Code/external/broken-zeitwerk-minimal-ex/app/channels/application_cable

@Aquaj
Copy link
Author

Aquaj commented Jan 9, 2020

So I guess the flow is:

  • Initial require in Ruby's implementation
  • Rubygems (already present at the start of the binstub)
  • Revert by Bundler (Redefinition)
  • Zeitwerk (nil [original]=> zeitwerk/kernel.rb)
  • Bootsnap (zeitwerk/kernel.rb=> bootsnap/load_path_cache/core_ext/kernel_require.rb)
  • ActiveSupport (bootsnap/load_path_cache/core_ext/kernel_require.rb => active_support/dependencies.rb)

ActiveSupport modifies it by including a module which then calls super, so ultimately it defaults to Bootsnap's implementation.
And Bootsnap renames the original require (into require_without_bootsnap) and uses it in its own implementation, so it should eventually fallback to Zeitwerk, according to this apparent flow.

So by the last puts it should still be working, which means it's something within Rails.application.initialize! that mess it up.


Editing my own message to avoid spamming comments.
I added some logging inside initialize, then in the run_initializers loop and it's definitely during the let_zeitwerk_take_over initializer that it sets back to the original implementation.

...
karafka.1 | =[WARN_IF_AUTOLOADED/POST-RUN]= ["/Users/aquaj/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/activesupport-6.0.0/lib/active_support/dependencies.rb", 323]
karafka.1 | =[LET_ZEITWERK_TAKE_OVER/PRE-RUN]= ["/Users/aquaj/.rbenv/versions/2.6.5/lib/ruby/gems/2.6.0/gems/activesupport-6.0.0/lib/active_support/dependencies.rb", 323]
karafka.1 | =[LET_ZEITWERK_TAKE_OVER/POST-RUN]= nil
karafka.1 | =[ADD_BUILTIN_ROUTE/PRE-RUN]= nil
...

Going further down I can see it go from:

  • ActiveSupport::Dependencies::ZeitwerkIntegration::Decorations.decorate_dependencies, to
  • ActiveSupport::Dependencies.unhook!, to
  • ActiveSupport::Dependencies::Loadable.exclude_from(Object)
    which then calls
        Object.class_eval do
          # ...
          define_method(:require, Kernel.instance_method(:require))
          private :require
        end

So now the question is: who resets Kernel#require ? (on a non-affected rails c at that point it's still from Zeitwerk's source, on the affected bundle exec it's nil)


Turns out it's Bundler, during a call to setup, but only because despite having already been called, its @setup variable isn't populated. (once again comparing with a non-affected rails c confirms that)

I've also noticed from the numerous logs that Zeitwerk doesn't seem to be coming in at the same time during a rails c call versus the affected bundle exec:

--- tmp/error_calls     2020-01-10 02:09:46.000000000 +0100
+++ tmp/ok_calls        2020-01-10 02:12:57.000000000 +0100
@@ -1,42 +1,41 @@
 ==== BUNDLER SETUP v2.0.2 - Bundler.object_id: <doesn't change>
 @setup.present? = false
 @groups.empty? = true
 ====
-=[ZEITWERK/PRE-MODULE]
-=[ZEITWERK/POST-MODULE]
-=[KARAFKA_BOOT_FILE/PRE-REQ]
-=[ENVIRONMENT/PRE-REQ_REL]
 =[APPLICATION/PRE-REQ_REL]
 =[BOOT/PRE-REQ_BUNDLER]
 =[BOOT/POST-REQ_BUNDLER]
 =[BOOTSNAP/PRE-MODULE]
 =[BOOTSNAP/POST-MODULE]
 =[ACTIVESUPPORT/PRE-LOADABLE]
 =[ACTIVESUPPORT/POST-LOADABLE]
 =[BOOT/POST-REQ_BOOTSNAP]
 =[APPLICATION/PRE-REQ]
 =[APPLICATION/PRE-BUNDLER]
 =[BUNDLER/PRE-SETUP]
 ==== BUNDLER SETUP v2.0.2 - Bundler.object_id: <doesn't change>
-@setup.present? = false
+@setup.present? = true
 @groups.empty? = false
 ====
 =[BUNDLER/PRE-REQUIRE]
+=[ZEITWERK/PRE-MODULE]
+=[ZEITWERK/POST-MODULE]
 =[BUNDLER/POST-REQUIRE]
 =[APPLICATION/POST-BUNDLER]
+=[ENVIRONMENT/PRE-REQ_REL]
 =[ENVIRONMENT/POST-REQ_REL]
 =[RAILTIES/PRE-INITIALIZERS]
 =[ ...INITIALIZERS... ]
 =[LET_ZEITWERK_TAKE_OVER/PRE-RUN]
 =[TAKE_OVER/PRE-INTEG]
 =[TAKE_OVER/PRE-SETUP]
 =[TAKE_OVER/PRE-FREEZE]
 =[TAKE_OVER/PRE-DECORATE]
 =[DECORATE/PRE-UNHOOK]
 =[ZEITWERK/PRE-CONST-EXCLUDE]
 =[ZEITWERK/PRE-LOAD-EXCLUDE]
 =[LOADABLE_EXCLUDE/PRE-LOAD]
 =[LOADABLE_EXCLUDE/PRE-REQUIRE]
 =[LOADABLE_EXCLUDE/POST-REQUIRE]
 =[ZEITWERK/POST-LOAD-EXCLUDE]

(I've committed all of those puts in the with_logs branch on the example app - even if that doesn't include the ones in Bundler, Bootsnap, ActiveSupport and Zeitwerk as they're not part of the repo.)

But I haven't figured out why yet. Throwing in the towel for tonight, I might try to run it through ruby-prof this weekend to get a better idea of the path it takes to get there.

Sorry this comment got a bit long, I've kind of been using it as a debugging log. But I figure it might be useful to anyone willing to look into it.

@fxn
Copy link
Owner

fxn commented Feb 14, 2020

Hey!

Zeitwerk decorates require, if some other code replaces require with a version that does not contain our decoration, that's not a bug in Zeitwerk.

This issue may be useful in a text search, and I'm curious and will personally try to understand what is happening, but since this is not a bug in Zeitwerk ---which routinely works together with Rails and Bundler,--- and since I like the issues and PRs counter to be 0 in this project, I prefer to close.

@fxn fxn closed this as completed Feb 14, 2020
@Aquaj
Copy link
Author

Aquaj commented Nov 18, 2020

Hi !
Looping back on this in case people stumble on the issue through search engines: this bug was ultimately fixed in Bundler's code, so you should consider updating that if you're encountering it — if you can't, I advise looking at the workarounds mentioned in other earlier comments.
Here's the issue I had opened on bundler/bundler back then: [Github] and [the PR for the fix] if people are interested.

@swrobel
Copy link

swrobel commented Nov 18, 2020

@Aquaj what version of bundler has the fix?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants