Implement strict argument checking #5071
validate is the right method. You're missing Hash, Array and any notion of recursive checking. I found it simpler to use a real JSON roundtrip i.e. the check that's commented out. I think that combined with your global flag is a better solution. |
of confirming the safety of job argument payloads. Cleanup commented-out code from a few years back. Co-authored-by: Eda Zhou <eda.zhou@gusto.com>
Sounds good. We've updated the PR to address the feedback! |
I'm worried about the equality check. Does that correctly handle deep elements, e.g. an Array of Symbols? |
Let's find out. We'll write a few more interesting test cases. |
…se/dump approach and deep structures Co-authored-by: Eda Zhou <eda.zhou@gusto.com>
Test cases improved. It looks like the JSON.parse/dump approach prevents things like symbols, either as values or as keys, from being serialized, while letting JSON-friendly hashes/arrays pass through.
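A minimal sketch of the round-trip check described above, assuming only Ruby's standard `json` library. The method name `json_safe?` is hypothetical and is not claimed to be the code in this PR; it only illustrates why a dump/parse comparison catches symbols at any depth.

```ruby
require "json"

# Hypothetical helper: true when the value survives JSON.dump -> JSON.parse unchanged.
def json_safe?(args)
  JSON.parse(JSON.dump(args)) == args
end

json_safe?([1, "two", { "three" => [4.0, nil, true] }]) # => true
json_safe?([:symbol])                                   # => false (comes back as "symbol")
json_safe?([{ id: 123 }])                               # => false (key comes back as "id")
json_safe?([[:deep, :symbols]])                         # => false (nested values are compared too)
```

Because `Array#==` and `Hash#==` compare elements recursively, the equality check covers deep structures without any explicit recursion.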
Ok, now:
|
My thinking: It should be configurable in 6.x but maybe become the default in
Seems like it could be a nice way of simplifying the API, but I can go either way. My assumption here is that calling
Yep. I'll push a commit to include a note about that once a decision is made for (2).
For sure in the Best Practices section about simple arguments. If we go with the 6.x/7.x split in (1), the Best Practice can read something like:
If we make it the default in 7.x, that message will need to change a bit. |
Maybe a WARN level log in 6.4+ and raise in 7+? We're really building a hyper-specialized linter here. Is this better off as a Rubocop cop? What do you think about warning about larger argument sizes, e.g. greater than 16kb? |
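The size warning floated above could look roughly like the following. This is a sketch under stated assumptions: `MAX_ARGS_BYTES` and `warn_on_large_args` are hypothetical names, and the threshold is simply the "16kb" figure from the comment, not a value confirmed anywhere in this PR.

```ruby
require "json"
require "logger"

# Hypothetical threshold, taken from the "16kb" figure in the comment above.
MAX_ARGS_BYTES = 16 * 1024

# Hypothetical helper: logs a warning when the serialized arguments exceed
# the threshold. Returns the serialized size in bytes.
def warn_on_large_args(args, logger: Logger.new($stderr))
  size = JSON.generate(args).bytesize
  if size > MAX_ARGS_BYTES
    logger.warn("Job arguments are #{size} bytes, larger than #{MAX_ARGS_BYTES} bytes")
  end
  size
end
```

Measuring `bytesize` of the generated JSON, rather than inspecting the Ruby objects, keeps the check aligned with what would actually be stored.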
Sounds good. I'll make it so.
The check we're doing is a bit more of a runtime check (whereas I think of linters as static checks), but I think the ideal place for this type of check is statically. I believe Sorbet/RBS definitions would be the best place to enforce this type of thing, since they could do so without actually running the code. Probably something to figure out for the future, though, as the community stabilizes around a type system.
Absolutely. I was going to propose that in a future PR :) Happy to roll it into this one if that makes more sense. For now, I'll get things tidied up based on the conversation so far. |
…d how to enable strict_mode and the best practice
Alright, ready for another review at your leisure. Let me know if you'd like me to squish the commits on this PR, since we've accumulated quite a few. |
Looking pretty good. Do we need a way for the user to disable the check completely (to restore current behavior)? Consider the case where they have an app that breaks the best practices but their code handles it.
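One way to picture the three behaviors discussed so far (raise, warn, or disabled entirely) is a single mode switch. The names `check_args` and `UnsafeArgumentsError` are hypothetical and purely illustrative; this is not presented as Sidekiq's actual API.

```ruby
require "json"

# Hypothetical error class; the name is illustrative only.
class UnsafeArgumentsError < ArgumentError; end

# Hypothetical helper.
#   mode: :raise -> raise on unsafe arguments
#   mode: :warn  -> print a warning and continue
#   mode: false  -> skip the check (restores the previous behavior)
def check_args(args, mode: :warn, out: $stderr)
  return args if mode == false
  return args if JSON.parse(JSON.dump(args)) == args

  message = "Job arguments must be native JSON types: #{args.inspect}"
  raise UnsafeArgumentsError, message if mode == :raise

  out.puts("WARN: #{message}")
  args
end
```

Keeping the disabled mode as an early return means apps that knowingly break the best practice pay no serialization cost for the check.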
…ead of all at once Co-authored-by: Eda Zhou <eda.zhou@gusto.com>
…imple Co-authored-by: Eda Zhou <eda.zhou@gusto.com>
Perhaps we're panicking for no reason and the option to disable it will stay available? The Changelog is a bit confusing for me...
Would |
It will remain available for backwards compatibility reasons. |
Thanks @mperham. That's very reassuring. 👍 |
Am I missing something or am I in the wrong section: warning in logs:
Still clogging up my logs, and this is just from live polling on the dashboard. |
@NemyaNation Why are you commenting on this issue? |
I'm kind of with @bnorton and @gingerlime on this. I can understand the necessity of wanting to ensure users don't send along a whole Ruby object, but the strict enforcement on hashes feels kind of hamfisted. As the change exists currently, I have to go through and change hundreds of lines of code, because a
that is called like this
is required to be set up like this
Even though the output of the hash in both cases with
But the latter format is no longer Ruby idiomatic and is largely discouraged over utilizing the It feels like the |
No one is forcing anyone to use the arg checking. You can use |
Right, I'm aware of the initializer flag that can be set, but that still leaves the fact that the argument checking will probably be somewhat confusing to individuals who are passing valid JSON serializable arguments, that the wiki says are acceptable, and getting errors about them. Likewise, that is negating the protection of potentially passing actual invalid objects as well. |
If the wiki says they are acceptable, let's fix the wiki. Symbols are not valid JSON. |
Hash rockets aren't either. The issue here is not what is valid JSON; it is what the check requires in order to consider the object valid. Not valid JSON, not accepted: Not valid JSON, accepted: Neither of these is a valid JSON object. If we were only accepting valid JSON objects, then only a variant such as this should pass
|
Let's be clear here:
|
Yeah, I get that, but so does
There is no scenario I'm aware of where
|
Just postulating, but would it make sense perhaps to check the validity of parse against both iterations (with and without
|
The symbolized keys have bitten us a few times before, which is why we introduced this check. While a hash with symbol keys serializes safely to JSON, it does not deserialize to the same value. Here's a code snippet to illustrate this:
class MyWorker
  include Sidekiq::Worker
  def perform(hash)
    id = hash[:id] # will never exist, but hash['id'] will!
    # ...
  end
end
MyWorker.perform_async({ id: 123 }) #=> becomes the JSON "{ \"id\": 123 }"
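The same effect can be reproduced without Sidekiq, along with two worker-side ways of getting symbol keys back. `simulate_enqueue` is a hypothetical stand-in for the serialize/deserialize step, assuming only the standard `json` library.

```ruby
require "json"

# Hypothetical stand-in for what happens between perform_async and perform:
# the arguments are serialized to JSON and parsed again.
def simulate_enqueue(args)
  JSON.parse(JSON.generate(args))
end

received = simulate_enqueue([{ id: 123 }]).first
received[:id]  # => nil
received["id"] # => 123

# Worker-side option 1: convert top-level keys back to symbols (shallow).
received.transform_keys(&:to_sym)[:id] # => 123

# Worker-side option 2: parse with symbolized names (applies to nested hashes too).
JSON.parse(JSON.generate([{ id: 123 }]), symbolize_names: true).first[:id] # => 123
```

Either option addresses how the worker reads its arguments; neither changes what the strict check sees on the calling side.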
@kellysutton yeah, I can certainly understand that scenario. Unfortunately that is one of the things where I think it makes sense to say "your received arguments will always be string-keyed hashes", but I'm not sure it makes sense to say "you have to adhere to this explicit format for sending arguments," especially when, as noted above, the format doesn't actually dictate the validity of the object serialization, instead just being an arbitrary way that just happens to match how the |
Also, I just want to note that I'm not trying to argue this point out of a sense of stubbornness or anything of the sort. I'm advocating for this perspective because I do feel like this change will cause an undue burden and confusion on the end user, especially with the change several ruby versions back that the hash rocket was the less favored way of generating hashes over the |
* Fix Sidekiq warnings about JSON serialization This occurs on every symbol argument we pass, and every symbol key in hashes, because Sidekiq expects strings instead. See sidekiq/sidekiq#5071 We do not need to change how workers parse their arguments because this has not changed and we were already converting to symbols adequately or using `with_indifferent_access`. * Set Sidekiq to raise on unsafe arguments in test mode In order to more easily catch issues that would produce warnings in production code.
Sidekiq 6.4 has started to log warnings (sidekiq/sidekiq#5071) if the job arguments are not strictly JSON-safe, i.e. if serializing to JSON and deserializing from JSON do not yield the original value. In practice, we cannot use a hash with symbol keys, since the JSON round trip changes them all to string keys. This fix does the JSON round trip for the job arguments beforehand so Sidekiq won't complain about it (and break starting with version 7.0).
Sidekiq 6.4 has started to log warnings (sidekiq/sidekiq#5071) if the job arguments are not strictly JSON-safe, i.e. if serializing to JSON and deserializing from JSON do not yield the original value. In practice, we cannot use a hash with symbol keys, since the JSON round trip changes them all to string keys. This fix does the JSON round trip for the job argument beforehand so Sidekiq won't complain about it (and break starting with version 7.0).
Sidekiq 6.4 has started to log warnings (sidekiq/sidekiq#5071) if the job arguments are not strictly JSON-safe, i.e. if serializing to JSON and deserializing from JSON do not yield the original value. In practice, we cannot use a hash with symbol keys, since the JSON round trip changes them all to string keys. This fix transforms the hash argument with symbol keys to string keys so Sidekiq won't complain about it (and break starting with version 7.0).
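The two caller-side fixes mentioned in these commit messages can be sketched as follows. Both helper names are hypothetical, and neither is claimed to be the code those commits actually used.

```ruby
require "json"

# Hypothetical helper: recursively converts Hash keys to Strings and leaves
# every value untouched.
def deep_stringify_keys(value)
  case value
  when Hash
    value.each_with_object({}) { |(k, v), out| out[k.to_s] = deep_stringify_keys(v) }
  when Array
    value.map { |v| deep_stringify_keys(v) }
  else
    value
  end
end

# Hypothetical helper: performs the JSON round trip before enqueueing.
def json_round_trip(value)
  JSON.parse(JSON.generate(value))
end

deep_stringify_keys({ id: 1, tags: [{ name: "a" }] })
# => {"id"=>1, "tags"=>[{"name"=>"a"}]}

json_round_trip({ id: 1, kind: :urgent })
# => {"id"=>1, "kind"=>"urgent"}
```

The two approaches are not equivalent: the round trip also silently turns symbol values into strings, whereas `deep_stringify_keys` leaves a value such as `:urgent` in place, so a strict check would still flag it.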
It's a year old thread but maybe somebody could shed a bit of light on this for me as this seems strictly related?
I pass a simple, two k/v pairs Info says the version is: |
@silverdr I ran into something similar that looked odd initially, turns out I was serializing a BigDecimal instance as a value when it looked like a string. Open a Rails console, set a breakpoint (ruby/debug or byebug: |
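When a payload looks fine but still trips the warning, it can help to locate the offending value rather than eyeball the whole structure. The helper below is a hypothetical debugging sketch, not part of Sidekiq: it walks the arguments and reports every key or value whose class is not a native JSON type. It would report a BigDecimal like the one described above by class name, although only Symbols are exercised here.

```ruby
# Hypothetical debugging helper: returns a description of every key or value
# that is not a native JSON type. It does not check special Floats such as NaN.
def unsafe_paths(value, path = "args")
  case value
  when Hash
    value.flat_map do |k, v|
      key_issue = k.is_a?(String) ? [] : ["#{path} key #{k.inspect} (#{k.class})"]
      key_issue + unsafe_paths(v, "#{path}[#{k.inspect}]")
    end
  when Array
    value.each_with_index.flat_map { |v, i| unsafe_paths(v, "#{path}[#{i}]") }
  when String, Integer, Float, true, false, nil
    []
  else
    ["#{path} (#{value.class})"]
  end
end

unsafe_paths([{ "amount" => :pending, "name" => "a" }])
# => ["args[0][\"amount\"] (Symbol)"]
```

Running something like this from a console or at a breakpoint shows the class of each suspect value directly.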
This PR implements a runtime type check for arguments as a follow-up from the discussion here: #5070. The idea is to provide some safeguards in non-production environments for folks about to do something dangerous (i.e. use arguments in a job that cannot (de)serialize reliably to/from JSON).
Open Questions:
Is #validate or #normalize_item the right place for this logic?
perform_async is called with poorly serializable arguments? #2870?
Notes:
perform_async is called with poorly serializable arguments? #2870 and ...