New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove use of json gem #2479
Remove use of json gem #2479
Conversation
For now, it only handles Arrays of Integers, but we'll extend it to support all of the common types
Fixes a bug where requesting `/stats` from the status server would cause subsequent phased restarts to fail when upgrading/downgrading the json gem.
These were disabled at some point on JRuby, but they seem to run fine. Importantly, this test ensures that a call to `/gc-stats` returns well-formed JSON on JRuby, where the value of `GC.stat` contains nested structures.
Fixes a bug where requesting `/gc-stats` from the status server would cause subsequent phased restarts to fail when upgrading/downgrading the json gem.
Fixes a bug where requesting `/thread-backtraces` from the status server would cause subsequent phased restarts to fail when upgrading/downgrading the json gem.
Fixes a bug where accessing `Puma.stats` would cause subsequent phased restarts to fail when upgrading/downgrading the json gem.
fb660cc
to
e6547fc
Compare
@@ -0,0 +1,85 @@ | |||
# frozen_string_literal: true | |||
|
|||
module Puma |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You really went above and beyond here. Better than what we had in 3.x by a mile!
100% for the approach, will do a line by line review soon |
e6547fc
to
8bab708
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great!
The integration tests you added should prevent the specific scenarios that currently make it impossible to upgrade/downgrade json
during a phased restart. If I understand correctly, though, the test suite could still pass even if someone were to introduce a new usage of json
in some other code path, right?
I'm not sure there's much we can do about that (unless we want to ban the usage of require 'json'
in general...) but just wanted to confirm that we'd still want to be on the lookout for new usages of json
that could cause problems with phased restarts.
@@ -40,6 +40,8 @@ def serialize_value(output, value) | |||
serialize_string output, value | |||
when Integer | |||
output << value.to_s | |||
else | |||
raise SerializationError, "Unexpected value of type #{value.class}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add a test for this case? (We already have tests for the other two SerializationError
cases.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch! 806f5c4
History.md
Outdated
@@ -13,6 +13,7 @@ | |||
* Ignore illegal (by Rack spec) response header ([#2439]) | |||
* Close idle connections immediately on shutdown ([#2460]) | |||
* Fix some instances of phased restart errors related to the `json` gem (#2473) | |||
* Remove use of `json` gem to fixed phased restart errors (#2479) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor typo here:
* Remove use of `json` gem to fixed phased restart errors (#2479) | |
* Remove use of `json` gem to fix phased restart errors (#2479) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed up in 0bd5531
Co-authored-by: rmacklin <1863540+rmacklin@users.noreply.github.com>
bc37311
to
9317230
Compare
9317230
to
37641b5
Compare
37641b5
to
28915e7
Compare
lib/puma/json.rb
Outdated
case value | ||
when Array, Hash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this needed? serialize_value
already handles all types or raises a SerializationError
if needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think my implementation was based on an understanding of JSON according to RFC 4627, which requires that a valid JSON text is either an object or any array, but it looks like that requirement was dropped in RFC 7158. I'll update the implementation since now there's no need to handle Hash
and Array
in any special way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in 8d05a2f
lib/puma/json.rb
Outdated
def generate(value) | ||
case value | ||
when Array, Hash | ||
parts = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than an array, should it use StringIO? Not sure if there is a (performance) benefit for one or the other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered using stringio
at first since it seems like a nice way to construct a string incrementally without an explosion in extra space in that same way that the Array#<<
and Array#join
strategy does. I went with just a normal Array
because even stringio
isn't in Ruby Core and, just like json
, users can technically change the version of stringio
they're using in a phased restart, which would cause workers to fail to boot. That said puma
already uses stringio
internally, so this is already a bug, so I could go either way on this.
I can't imagine the performance improvement would be too significant, especially since again, this module isn't really on a hot path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I'd switch to stringio
is if it would improve readability/maintainability. I'll play with it and see what it looks like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Switched to StringIO in f818d65.
I ended up doing this after all because I think using the Array
approach I was doing before stood out as unusual after thinking about it. If someone was trying to build a String incrementally while keep allocations to a minimum, a reader of the code would expect the author to have used stringio
. Deviating from that expectation would suggest that the distinction was significant, but in this case, it isn't.
class SerializationError < StandardError; end | ||
|
||
class << self | ||
def generate(value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to attempt using the native JSON
module for those who don't use phased restarts? It'd be faster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the cost of doing that might not be worth the benefit. Use of JSON serialization in puma isn't on a "hot" path, and there would have to be additional complexity to pass around the information such that Puma::JSON
(or its callers) knew when it was appropriate to use the built-in ::JSON
or the new Puma::JSON
. If we get it wrong somewhere, that's an opportunity to re-introduce a regression.
lib/puma/json.rb
Outdated
when Symbol | ||
serialize_string output, value.to_s | ||
when String | ||
serialize_string output, value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the past I used inspect for strings and symbols. Maybe replace serialize_string
with inspect
:
when Symbol, String
output << value.to_s.inspect
or
when Symbol
output << value.to_s.inspect
when String
output << value.inspect
I think it does all the needed processing?
EDIT: Just noticed the call ro serialize_string
in serialize_value
, that needs to be replaced also. Just tried the test locally, and inspect
seems fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll dig deeper into this. I realized I'm not handling all of the different escape sequences correctly. Newline characters, for example, are written literally instead of as \n
.
I'll have to double check that everything that .inspect
emits is valid JSON. If it is, I'll use it, otherwise, I'll handle the special characters in a different way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Your comment said
# The implementation of JSON serialization in this module is not designed to
# be particularly full-featured or fast. It just has to handle the few places
# where Puma relies on JSON serialization internally.
I suspect you know better than any of us what Puma sends to JSON.generate
(having just reviewed it), so I'll trust your judgement as to what edge cases it should handle.
EDIT: I just checked (not exhaustively) arrays composed of strings and/or numbers, and inspect
is equivalent to JSON.generate
with a space between members...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added 98b3408 to handle control characters correctly and added 28e9d8f to simplify the handling of String
and Symbol
.
AFAICT, there's no guarantee that .inspect
produces valid JSON (or will always produce valid JSON into the future), and adding support for control characters explicitly ended up not being so bad. I think the cost of adding the cases for the different characters is small, and it makes the implementation more closely match the language of the specification. I prefer that to having to worry about some edge case down the road where .inspect
produces something that JSON parsers don't like.
The pervious implementation was based on and older JSON standard which defined JSON texts to be either objects or arrays. Modern JSON standands allow all JSON values to be valid JSON texts.
513aab1
to
f818d65
Compare
require_relative "helper" | ||
require "puma/json" | ||
|
||
class TestJSON < Minitest::Test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For all of these, does it make sense to assert JSON.parse(Puma::JSON.generate(value)) == value
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ensuring that Puma::JSON.generate
returns strings that can by parsed by ::JSON.parse
seems useful, for sure.
Updated in b905c34
`::JSON.parse` doesn't handle JSON texts other than objects and arrays in old versions
test/test_json.rb
Outdated
|
||
private | ||
|
||
def assert_puma_json_generates_string(expected_output, value_to_serialize, expected_roundtrip: value_to_serialize) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've never seen this before. Does value_to_serialize
refer to the value that was passed in as the second argument?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to work that way, but I couldn't find that behavior being explicitly supported in any official Ruby documentation, so I made it a little more clear in 0bb2df3
I think I've addressed all of the feedback so far. Lemme know if there's anything else we should do! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a complex PR so I may have missed something, but overall I think it's a good approach.
Description
Fixes #2471. The changes proposed in this PR in particular are described in a comment on the same issue. This PR introduces a new
Puma::JSON
module, which is meant to handle the specific places where Puma needs to perform JSON serialization.Your checklist for this pull request
[changelog skip]
or[ci skip]
to the pull request title.[ci skip]
to the title of the PR.#issue
" to the PR description or my commit messages.