Skip to content
This repository has been archived by the owner on Apr 14, 2021. It is now read-only.

Bundler 1.16.2 regression: gemspec referencing code with non-ascii characters requires "encoding: UTF-8" magic comment #6598

Closed
ivoanjo opened this issue Jun 24, 2018 · 14 comments · Fixed by #6599

Comments

@ivoanjo
Copy link

ivoanjo commented Jun 24, 2018

Hello there and thanks for Bundler!

I'm the author of the Persistent-💎 gem, which uses as its namespace a module Persistent💎.

The gemspec for the gem thus includes:

Gem::Specification.new do |spec|
  spec.name          = 'persistent-dmnd'
  spec.version       = Persistent💎::VERSION

Because I support this gem also on Ruby 1.9.3, I've added a # encoding: UTF-8 at the top of every file so Ruby correctly handles the emoji.

But, while debugging a related TruffleRuby issue (see oracle/truffleruby#1376) we discovered that whereas with Bundler 1.16.1 the gem's gemspec file works correctly on Ruby 2.0->2.6 without a # encoding: UTF-8 comment at the top of the file:

$ ruby -v
ruby 2.5.1p57 (2018-03-29 revision 63029) [x86_64-linux]
$ bundle -v
Bundler version 1.16.1
$ git diff
diff --git a/persistent-dmnd.gemspec b/persistent-dmnd.gemspec
index 72c50a6..e4b3dad 100644
--- a/persistent-dmnd.gemspec
+++ b/persistent-dmnd.gemspec
@@ -1,4 +1,3 @@
-# encoding: UTF-8
 
 # Persistent-💎: Ruby gem for easily creating immutable data structures
 # Copyright (c) 2017 Ivo Anjo <ivo.anjo@ist.utl.pt>
$ bundle install
Warning: the running version of Bundler (1.16.1) is older than the version that created the lockfile (1.16.2). We suggest you upgrade to the latest version of Bundler by running `gem install bundler`.
[snip]
Bundle complete! 9 Gemfile dependencies, 18 gems now installed.
Use `bundle info [gemname]` to see where a bundled gem is installed.

...on bundler 1.16.2 this breaks:

$ bundle -v
Bundler version 1.16.2
$ bundle install
Traceback (most recent call last):
        24: .rvm/gems/ruby-2.5.1/bin/ruby_executable_hooks:15:in `<main>'
        23: .rvm/gems/ruby-2.5.1/bin/ruby_executable_hooks:15:in `eval'
        22: .rvm/gems/ruby-2.5.1/bin/bundle:23:in `<main>'
        21: .rvm/gems/ruby-2.5.1/bin/bundle:23:in `load'
        20: .rvm/gems/ruby-2.5.1/gems/bundler-1.16.2/exe/bundle:22:in `<top (required)>'
        19: .rvm/gems/ruby-2.5.1/gems/bundler-1.16.2/lib/bundler/friendly_errors.rb:124:in `with_friendly_errors'
        18: .rvm/gems/ruby-2.5.1/gems/bundler-1.16.2/exe/bundle:30:in `block in <top (required)>'
        17: .rvm/gems/ruby-2.5.1/gems/bundler-1.16.2/lib/bundler/cli.rb:18:in `start'
        16: .rvm/gems/ruby-2.5.1/gems/bundler-1.16.2/lib/bundler/vendor/thor/lib/thor/base.rb:466:in `start'
        15: .rvm/gems/ruby-2.5.1/gems/bundler-1.16.2/lib/bundler/cli.rb:27:in `dispatch'
        14: .rvm/gems/ruby-2.5.1/gems/bundler-1.16.2/lib/bundler/vendor/thor/lib/thor.rb:387:in `dispatch'
        13: .rvm/gems/ruby-2.5.1/gems/bundler-1.16.2/lib/bundler/vendor/thor/lib/thor/invocation.rb:126:in `invoke_command'
        12: .rvm/gems/ruby-2.5.1/gems/bundler-1.16.2/lib/bundler/vendor/thor/lib/thor/command.rb:27:in `run'
        11: .rvm/gems/ruby-2.5.1/gems/bundler-1.16.2/lib/bundler/cli.rb:223:in `install'
        10: .rvm/gems/ruby-2.5.1/gems/bundler-1.16.2/lib/bundler/settings.rb:136:in `temporary'
         9: .rvm/gems/ruby-2.5.1/gems/bundler-1.16.2/lib/bundler/cli.rb:224:in `block in install'
         8: .rvm/gems/ruby-2.5.1/gems/bundler-1.16.2/lib/bundler/cli/install.rb:62:in `run'
         7: .rvm/gems/ruby-2.5.1/gems/bundler-1.16.2/lib/bundler.rb:135:in `definition'
         6: .rvm/gems/ruby-2.5.1/gems/bundler-1.16.2/lib/bundler/definition.rb:35:in `build'
         5: .rvm/gems/ruby-2.5.1/gems/bundler-1.16.2/lib/bundler/dsl.rb:12:in `evaluate'
         4: .rvm/gems/ruby-2.5.1/gems/bundler-1.16.2/lib/bundler/dsl.rb:42:in `eval_gemfile'
         3: .rvm/gems/ruby-2.5.1/gems/bundler-1.16.2/lib/bundler/dsl.rb:51:in `rescue in eval_gemfile'
         2: .rvm/gems/ruby-2.5.1/gems/bundler-1.16.2/lib/bundler/dsl.rb:51:in `message'
         1: .rvm/gems/ruby-2.5.1/gems/bundler-1.16.2/lib/bundler/dsl.rb:51:in `to_s'
.rvm/gems/ruby-2.5.1/gems/bundler-1.16.2/lib/bundler/dsl.rb:575:in `to_s': incompatible character encodings: UTF-8 and ASCII-8BIT (Encoding::CompatibilityError)
@deivid-rodriguez
Copy link
Member

Hello @ivoanjo! Maybe #6279 is the culprit... Can you confirm?

@ivoanjo
Copy link
Author

ivoanjo commented Jun 24, 2018

Thanks for the reply @deivid-rodriguez . I can confirm that reverting the call to read_file(file) back to path.read in load_gemspec_uncached fixed it, so the regression defnitely came from there.

@deivid-rodriguez
Copy link
Member

I think reverting the fix could be a good solution since the error message is more clear now. However, the spec I added to prove my fix hangs if I revert it, so I think we need to investigate why that happens before considering that.

@deivid-rodriguez
Copy link
Member

Ok, So I had a look at this.

Reverting #6279, and adding a rescue to the exception formatting method seems to stop the hang on bundler's master and give a meaninful error message instead, and keeps things working as they are now on 1-16 stable (it just works).

However, I have not been able to reproduce your issue (incompatible character encodings: UTF-8 and ASCII-8BIT (Encoding::CompatibilityError)) unless I explicitly introduce a syntax error on a file with a UTF-8 name. 🤔

This is the diff I'm working with (the new spec added passes on both branches and fails without the changes in lib, the old one passes on master but fails on 1.16-stable because the old behavior is kept on this branch).

Not sure why this difference between master and 1.16-stable, whether it is acceptable, and whether the rest of the specs pass with these changes (running them now).

diff --git a/lib/bundler.rb b/lib/bundler.rb
index b7002db99..010b26f3b 100644
--- a/lib/bundler.rb
+++ b/lib/bundler.rb
@@ -427,7 +427,7 @@ EOF
 
     def load_gemspec_uncached(file, validate = false)
       path = Pathname.new(file)
-      contents = read_file(file)
+      contents = path.read
       spec = if contents.start_with?("---") # YAML header
         eval_yaml_gemspec(path, contents)
       else
diff --git a/lib/bundler/dsl.rb b/lib/bundler/dsl.rb
index 868116327..d76ffbda6 100644
--- a/lib/bundler/dsl.rb
+++ b/lib/bundler/dsl.rb
@@ -588,6 +588,8 @@ The :#{name} git source is deprecated, and will be removed in Bundler 2.0.#{addi
           description = description.sub(/#{Regexp.quote trace_line}:\s*/, "").sub("\n", " - ")
         end
         [trace_line, description]
+      rescue
+        [nil, description]
       end
     end
 
diff --git a/spec/install/gemspecs_spec.rb b/spec/install/gemspecs_spec.rb
index 5b6778bdd..a435635a8 100644
--- a/spec/install/gemspecs_spec.rb
+++ b/spec/install/gemspecs_spec.rb
@@ -46,7 +46,7 @@ RSpec.describe "bundle install" do
     expect(the_bundle).to include_gems "activesupport 2.3.2"
   end
 
-  it "does not hang when gemspec has incompatible encoding" do
+  it "shows a friendly error when gemspec has incompatible encoding" do
     create_file "foo.gemspec", <<-G
       Gem::Specification.new do |gem|
         gem.name = "pry-byebug"
@@ -60,7 +60,38 @@ RSpec.describe "bundle install" do
       gemspec
     G
 
-    expect(out).to include("Bundle complete!")
+    expect(out).to include("[!] There was an error while loading `foo.gemspec`: invalid multibyte char (US-ASCII)")
+    expect(out).to match(/ #  from .*\/foo.gemspec:4/)
+    expect(out).to include(" #  -------------------------------------------")
+    expect(out).to include(' #    gem.version = "3.4.2"')
+    expect(out).to include(' >    gem.author = "David Rodríguez"')
+    expect(out).to include(' #    gem.summary = "Good stuff"')
+    expect(out).to include(" #  -------------------------------------------")
+  end
+
+  if RUBY_VERSION >= "2.0"
+    it "shows a friendly error when syntax errors happen on files with UTF-8 names" do
+      create_file("💎.rb", <<-RUBY)
+        VERSION = "0.0.1
+      RUBY
+
+      create_file("persistent-dmnd.gemspec", <<-G)
+        require_relative "💎"
+
+        Gem::Specification.new do |gem|
+          gem.name = "persistent-dmnd"
+          gem.version = VERSION
+          gem.author = "Ivo Anjo"
+          gem.summary = "Unscratchable stuff"
+        end
+      G
+
+      install_gemfile <<-G
+        gemspec
+      G
+
+      expect(out).to include("unterminated string meets end of file")
+    end
   end
 
   context "when ruby version is specified in gemspec and gemfile" do

@deivid-rodriguez
Copy link
Member

The changes don't seem to break any other specs. Any ideas on the different behavior between master and 1.16-stable?

@ivoanjo
Copy link
Author

ivoanjo commented Jun 24, 2018

Thanks you @deivid-rodriguez for digging down so fast into the issue!
Hopefully a fix is found that solves both issues.

As for reproducing it, I can reliably reproduce by cloning the repository and changing the gemspec. If it helps, I'm using Ubuntu 17.10 and rvm to set up Ruby.

@deivid-rodriguez
Copy link
Member

deivid-rodriguez commented Jun 24, 2018

Yeah, me too. I meant that I haven't been able to reproduce with a spec.

@eregon
Copy link
Contributor

eregon commented Jun 24, 2018

Note that it's a bit tricky to reproduce the issue with the Persistent💎 gemspec above.
One way is to have an external file (say version.rb) loaded by require like this:

module Persistent💎
  VERSION = '1.0.2'
end

and then the gemspec reads from that constant with Persistent💎::VERSION:

require_relative 'version.rb'

Gem::Specification.new do |spec|
  spec.name          = 'persistent-dmnd'
  spec.version       = Persistent💎::VERSION
  spec.authors       = 'Ivo Anjo'
  spec.email         = 'ivo.anjo@ist.utl.pt'
end

The required file has implicitly an encoding of UTF-8, while the gemspec got a binary encoding when using read_file, and so both Persistent💎 end up referring to different things in that context, leading to an error.

Note that path.read likely depends on the locale encoding, which seems unreliable.
Probably path.read(encoding: Encoding::UTF_8) is better and then the result does no longer depend on the current locale.
That means Bundler would arbitrarily set the source encoding of gemspec to UTF-8, if there is no magic encoding comment, but that seems acceptable and more reliable than the locale encoding.

Finally, it's worth noting the original gemspec should have the magic encoding comment, to work on older Bundler and with a locale != UTF-8 (I guess it would fail without the magic encoding comment in that situation).

@deivid-rodriguez
Copy link
Member

Yeah, I did try exactly that, and I was getting that the constant Persistent💎 was undefined. That seems consistent with your explanation, right?

@eregon
Copy link
Contributor

eregon commented Jun 24, 2018

Yes, that's expected and shows the regression.
The Encoding::CompatibilityError above is just an error while trying to format the missing constant error message nicely IIRC.

@deivid-rodriguez
Copy link
Member

And explicitly passing the encoding to path.read fixed the spec on master too! Thanks for that tip! :) I'll open a PR shortly.

bundlerbot added a commit that referenced this issue Jun 26, 2018
…n, r=colby-swandale

Respect encodings when reading gemspecs

This PR fixes #6598. Thanks @eregon for the help and the explanation that helped me understand the issue :)!

### What was the end-user problem that led to this PR?

On gems using UTF-8 symbols defined in other files in their gemspecs, `bundle install` would crash.

### What was your diagnosis of the problem?

The problem was that since #6279 gemspecs are read binarily. However, files required from them as read as text. That means that the constants defined on those files and used in the gemspec are interpreted different and thus don't match.

### What is your fix for the problem, implemented in this PR?

My fix is to go back to reading gemspec as text again. Explictly passing the encoding when reading them still fixes the problem that the PR introducing the regression was meant to fix.

### Why did you choose this fix out of the possible options?

I chose this fix because it fixes the problem, and keeps the situation that the PR introducing the regression fixed also working.
@eregon
Copy link
Contributor

eregon commented Jun 26, 2018

Thank you for the quick fix @deivid-rodriguez!

@deivid-rodriguez
Copy link
Member

No problem, it was my regression after all 😅

@ivoanjo
Copy link
Author

ivoanjo commented Jun 27, 2018

+1 Thanks for the fix @deivid-rodriguez and hooray for much improved unicode support after your two PRs!

colby-swandale pushed a commit that referenced this issue Jul 10, 2018
…n, r=colby-swandale

Respect encodings when reading gemspecs

This PR fixes #6598. Thanks @eregon for the help and the explanation that helped me understand the issue :)!

### What was the end-user problem that led to this PR?

On gems using UTF-8 symbols defined in other files in their gemspecs, `bundle install` would crash.

### What was your diagnosis of the problem?

The problem was that since #6279 gemspecs are read binarily. However, files required from them as read as text. That means that the constants defined on those files and used in the gemspec are interpreted different and thus don't match.

### What is your fix for the problem, implemented in this PR?

My fix is to go back to reading gemspec as text again. Explictly passing the encoding when reading them still fixes the problem that the PR introducing the regression was meant to fix.

### Why did you choose this fix out of the possible options?

I chose this fix because it fixes the problem, and keeps the situation that the PR introducing the regression fixed also working.

(cherry picked from commit cb18acc)
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this issue Sep 23, 2018
## 1.16.5 (2018-09-18)

Changes:

  - Add support for TruffleRuby (@eregon)

Bugfixes:

  - Avoid printing git errors when checking the version on incorrectly packaged versions of Bundler ([#6453](rubygems/bundler#6453), @greysteil)
  - Fix issue where Bundler does not check the given class when comparing equality in DepProxy (@ChrisBr)
  - Handle `RangeNotSatisfiable` error in Compact Index (@MaxLap)
  - Check for initialized `search` variable in `LazySpecification` (@voxik)
  - Fix LoadError occurring in nested bundle exec calls ([#6537](rubygems/bundler#6537), @colby-swandale)
  - Check that Bundler::Deprecate is not an autoload constant ([#6163](rubygems/bundler#6163), @eregon)
  - Prefer non-pre-release versions when performing a `bundle update --patch` ([#6684](rubygems/bundler#6684), @segiddins)

## 1.16.4 (2017-08-17)

Changes:

  - Welcome new members to the Bundler core team (@indirect)
  - Don't mutate original error trees when determining version_conflict_message (@greysteil)
  - Update vendored Molinillo to 0.6.6 (@segiddins)

Bugfixes:

  - Reword bundle update regression message to be more clear to the user when a gem's version is downgraded ([#6584](rubygems/bundler#6584), @ralphbolo)
  - Respect --conservative flag when updating a dependency group ([#6560](rubygems/bundler#6560), @greysteil)
  - Fix issue where a pre-release version was not being selected when it's specified in the Gemfile ([#6449](rubygems/bundler#6449), @akihiro17)
  - Fix issue where `Etc` was not loaded when getting the user's home dir ([#6640](rubygems/bundler#6640), @colby-swandale)
  - Use UTF-8 for reading files including Gemfile ([#6660](rubygems/bundler#6660), @eregon)
  - Remove unnecessary `while` loop in path resolver helper (@ojab)

Documentation:

  - Document that `bundle show [--paths]` sorts results by name (@kemitchell)

## 1.16.3 (2018-07-17)

Features:

  - Support URI::File of Ruby 2.6 (@hsbt)

Bugfixes:

  - Expand symlinks during setup to allow Bundler to load correctly when using symlinks in $GEM_HOME ([#6465](rubygems/bundler#6465), @ojab, @indirect)
  - Dont let Bundler create temporary folders for gem installs which are owned by root ([#6258](rubygems/bundler#6258), @colby-swandale)
  - Don't fallback to using temporary directories when needed directories already exist ([#6546](rubygems/bundler#6546), @brodock)
  - Use SharedHelpers.filesystem_access when reading a Gemfile so friendly error messages can be given to the user ([#6541](rubygems/bundler#6541), @segiddins)
  - Check if source responds to `#remotes` before printing gem install error message ([#6211](rubygems/bundler#6211), @colby-swandale)
  - Handle Errno::ENOTSUP in the Bundler Process Lock to prevent exceptions when using NFS mounts ([#6566](rubygems/bundler#6566), @colby-swandale)
  - Respect encodings when reading gemspecs ([#6598](rubygems/bundler#6598), @deivid-rodriguez)

Documentation:

  - Fix links between manual pages (@BanzaiMan)
  - Add warning to Gemfile documentation for the use of the `source` option when declaring gems ([#6280](rubygems/bundler#6280), @forestgagnon)

## 1.16.2 (2018-04-20)

Changes:

  - Include the gem's source in the gem install error message when available (@papanikge)
  - Remove unnecessary executable bit from gem template (@voxik)
  - Dont add the timestamp comment with gems added to the Gemfile via `bundle add` ([#6193](rubygems/bundler#6193), @cpgo)
  - Improve yanked gem error message (@alyssais)
  - Use `Bundler.rubygems.inflate` instead of the Gem::Util method directly (@segiddins)
  - Remove unused instance variable (@segiddins)

Bugfixes:

  - Only trap INT signal and have Ruby's signal default handler be invoked (@shayonj)
  - Fix warning about the use of `__FILE__` in RubyGems integration testing (@MSP-Greg)
  - Skip the outdated bundler check when MD5 is not available ([#6032](rubygems/bundler#6032), @segiddins)
  - Fallback to the original error if the friendly message raises (@segiddins)
  - Rename Bundler.frozen? to avoid Object method conflict ([#6252](rubygems/bundler#6252), @segiddins)
  - Ensure the bindir exists before installing gems (@segiddins)
  - Handle gzip corruption errors in the compact index client ([#6261](rubygems/bundler#6261), @colby-swandale)
  - Check if the current directory is writeable when writing files in `bundle gem` ([#6219](rubygems/bundler#6219), @nilsding)
  - Fix hang when gemspec has incompatible encoding (@deivid-rodriguez)
  - Gracefully handle when the lockfile is missing spec entries for the current platform ([#6079](rubygems/bundler#6079), @segiddins)
  - Use Gem::Util.inflate instead of Gem.inflate (@hsbt)
  - Update binstub generator to use new ERB.new arity in Ruby 2.6 (@koic)
  - Fix `source_location` call in rubygems integration (@MSP-Greg)
  - Use `filesystem_access` when copying files in Compact Index Updater ([#6289](rubygems/bundler#6289), @segiddins)
  - Fail gracefully when resetting git gems to the given revision fails ([#6324](rubygems/bundler#6324), @segiddins)
  - Handle exceptions that do not have a backtrace ([#6342](rubygems/bundler#6342), @nesaulov)
  - Check if stderr was closed before writing to it (@shime)
  - Handle updating a specific gem for a non-local platform ([#6350](rubygems/bundler#6350), @greysteil)
  - Bump the `bundle_binstub` check-length to 300 characters (@tduffield)
  - Fix specifying alterntive Lockfile with `bundle lock` when default gemfile is present  ([#6460](rubygems/bundler#6460), @agrim123)
  - Allow installing dependencies when the path is set to `.`  ([#6475](rubygems/bundler#6475), @segiddins)
  - Support Bundler installing on a readonly filesystem without a home directory ([#6461](rubygems/bundler#6461), @grosser)
  - Filter git uri credentials in source description (@segiddins)

Documentation:

  - Correct typos in `bundle binstubs` man page (@erikj, @samueloph)
  - Update links in `bundle gem` command documentation to use https (@KrauseFx)
  - Fix broken links between bundler man pages (@segiddins)
  - Add man page for the `bundle doctor` command ([#6243](rubygems/bundler#6243), @nholden)
  - Document `# frozen_string_literal` in `bundle init` Gemfile (@315tky)
  - Explain the gemspec files attribute in `bundle gem` template and print a link to bundler.io guides when running `bundle gem` ([#6246](rubygems/bundler#6246), @nesaulov)
  - Small copy tweaks & removed redundant phrasing in the bundler man page (@rubymorillo)
  - Improve the documentation of the settings load order in Bundler (@rubymorillo)
  - Added license info to main README (@rubymorillo)
  - Document parameters and return value of Injector#inject (@tobias-grasse)
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants